Werner Gautschi, 1927-1959


By J. R. Blum
Sandia Corporation 


Werner Gautschi was born on December 11, 1927, in Basel. A serious heart 
ailment suffered as a young boy prevented him from participating in many of 
the usual childhood activities and led to an early devotion to mathematics and 
music. In 1946 he entered the University of Basel and remained there until 1952, 
with the exception of three terms at Cambridge University during 1950-51. 
He graduated summa cum laude from the University of Basel in 1952, with a 
dissertation written under the direction of Professor A. Ostrowski. 

An early interest in Statistics and Computing brought him to the United 
States in 1953 in order to study these fields. He spent his first year here at the 
Institute for Advanced Studies, where he did computational work on eigenvalues 
and norms of matrices. In 1954 he joined the Statistical Laboratory at Berkeley 
for a two year period. Aside from his studies, research, and teaching, he made 
many valuable suggestions to Erich Lehmann who was writing Testing Statistical 
Hypotheses and to Henry Scheffé who was writing The Analysis of Variance. 

In the fall of 1956 he joined the faculty of Ohio State University and in the 
fall of 1957 he came to Indiana University for a two year period. During the 
summer of 1958 he returned to Switzerland where he married Erika Wüst and
brought her back to the United States. In the summer of 1959 he rejoined Ohio
State University where he remained until his death on October 3, 1959. A son, 
Thomas, was born on January 25, 1960. 

The death of a good man is a loss to all of us. Werner Gautschi was a good man, 
a fine scientist, and a sensitive pianist. His many friends and colleagues mourn 
him and remember him. 
THE CAPACITIES OF CERTAIN CHANNEL CLASSES UNDER
RANDOM CODING¹


By David Blackwell, Leo Breiman, and A. J. Thomasian
University of California, Berkeley 


1. Introduction and Summary. For any two finite sets U, V, a Markov matrix
s with row set U and column set V will be called a U, V channel. Thus a U, V
channel is any nonnegative function s, defined for all pairs (u, v), $u \in U$, $v \in V$,
for which

$$\sum_{v \in V} s(u, v) = 1 \quad \text{for all } u.$$


The sets U, V will be called the input and output sets, respectively, of the channel. 
We shall denote by M(U, V) the set of all U, V channels. A channel s may be 
thought of as a random device which, on being given an input element $u \in U$,
produces an output element $v \in V$, with the probability of a particular output v
given by s(u, v).
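
As a concrete illustration (a minimal sketch, not part of the original paper), a U, V channel can be stored as a row-stochastic matrix indexed by inputs and outputs. The helper names `is_channel` and `transmit`, and the numerical matrix, are hypothetical choices made for this example.

```python
import random

def is_channel(s, tol=1e-12):
    """Return True if s (a list of rows) is a Markov matrix:
    nonnegative entries and every row summing to 1."""
    return all(
        all(p >= 0 for p in row) and abs(sum(row) - 1.0) <= tol
        for row in s
    )

def transmit(s, u, rng=random):
    """Sample an output index v with probability s[u][v]."""
    return rng.choices(range(len(s[u])), weights=s[u], k=1)[0]

# Hypothetical channel with U = {0, 1} and V = {0, 1, 2}.
s = [[0.9, 0.0, 0.1],
     [0.0, 0.8, 0.2]]
```

Calling `transmit(s, 0)` returns output 0 or 2, since input 0 never produces output 1 in this matrix.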

A U, V channel s may be used as a means of communication from one person, 
the sender, to another person, the receiver. There is given in advance a finite set 
D of messages, exactly one of which will be presented to the sender for
transmission. The sender encodes the message by an encoding channel $s_1 \in M(D, U)$,
with $s_1(d, u)$ being the probability that input u is given to channel s when message
d is presented to the sender for transmission. When the receiver observes the
output v of the transmission channel s, he decodes it by a decoding
channel $s_2 \in M(V, D)$, with $s_2(v, d)$ being the probability that, on receiving the
transmission channel output v, the receiver will decide that message d is intended.
The pair $(s_1, s_2)$ will be called a (D, U, V) code. For a U, V channel s and a
(D, U, V) code $c = (s_1, s_2)$, the matrix $e(s, c) = s_1 s s_2$, which is an element of
M(D, D), will be called the error matrix of code c in channel s. Its (d, d’) element
is the probability that, when message d is presented to the sender, the receiver 
will decide that message d’ is intended, when code c is used on channel s. We 
shall be especially interested in the average error probability over all messages 
in the set D. This is the number 


$$\pi(s, c) = 1 - |D|^{-1} \operatorname{trace}\, e(s, c),$$


where | D | denotes the number of elements in the set D. 

A code $c = (s_1, s_2)$ will be called pure if only 0’s and 1’s occur as elements of
$s_1$, $s_2$. The (finite) set of all pure (D, U, V) codes will be denoted by C(D, U, V),
and a probability distribution k over C(D, U, V) will be called a random


Received October 21, 1959. 


1 This paper was prepared with the partial support of the Office of Naval Research (Nonr- 
222-53). This paper in whole or in part may be reproduced for any purpose of the United 
States Government. 


(D, U, V) code. We define the error matrix e(s, k) and average error probability
$\pi(s, k)$ for a random code k by

$$e(s, k) = \sum_{c \in C(D, U, V)} k(c)\, e(s, c), \qquad \pi(s, k) = 1 - |D|^{-1} \operatorname{trace}\, e(s, k).$$

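
The definitions of the error matrix and the average error probability can be sketched numerically as follows. This is an illustrative example under stated assumptions: the function names, the binary symmetric channel, and the two pure codes are hypothetical and do not come from the paper.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def error_matrix(s, code):
    """Error matrix s1 * s * s2 of a code (s1, s2) used on channel s."""
    s1, s2 = code
    return matmul(matmul(s1, s), s2)

def average_error(e):
    """1 - trace(e) / |D| for an error matrix e in M(D, D)."""
    return 1.0 - sum(e[d][d] for d in range(len(e))) / len(e)

def random_code_error_matrix(s, codes, k):
    """Mixture of the error matrices of pure codes with weights k(c)."""
    size = len(codes[0][0])
    total = [[0.0] * size for _ in range(size)]
    for code, weight in zip(codes, k):
        e = error_matrix(s, code)
        for i in range(size):
            for j in range(size):
                total[i][j] += weight * e[i][j]
    return total

# Hypothetical example: two messages over a binary symmetric channel.
bsc = [[0.9, 0.1], [0.1, 0.9]]
identity = [[1.0, 0.0], [0.0, 1.0]]
swap = [[0.0, 1.0], [1.0, 0.0]]
pure_codes = [(identity, identity), (swap, swap)]
mixed = random_code_error_matrix(bsc, pure_codes, [0.5, 0.5])
```

Both pure codes here have the same average error on this channel, so the equal mixture has it as well.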
It was observed by Shannon [4] that every (D, U, V) code c is equivalent to 
some random (D, U, V) code k, in the sense that 


$$e(s, k) = e(s, c) \quad \text{for all } s \in M(U, V).$$


The converse is not true. The greater generality of random codes lies in the 
possibility, with random codes, of correlated randomization in the encoding and 
decoding processes. This is a special case of the fact in game theory, noted by 
Kuhn [3], that every behavior strategy (code) is equivalent to some mixed 
strategy (random code), but the converse holds only in games of perfect recall 
(which the communication game is not). 

Shannon’s basic work in information theory [5], and most later work, has been 
concerned with the question: for a given U, V channel s and message set D, is 
there a pure code c which makes the average error probability $\pi(s, c)$ (or the
maximum error probability) small? For this question, the distinction between
pure codes and random codes is irrelevant (though even here random codes are
useful as tools [5]), since


$$\pi(s, k) = \sum_{c} k(c)\, \pi(s, c),$$


so that there is a pure code whose average error probability is at least as small 
as that of any random code. We shall be concerned with some cases in which 
D, U, V are given, but the transmission channel is known only to be some U, V 
channel in a given closed set $S \subset M(U, V)$. We ask: is there a random code k
for which $\pi(s, k)$ is small for every $s \in S$? For this question, as we shall see, the
distinction between random codes and pure codes is essential, for some sets S. 

Specifically, we shall be interested in D, U, V, S defined as follows. We are 
given a message set D (only | D |, the number of elements in D, will be relevant), 
an input set A, an output set B, a closed set $S_0$ of A, B channels, and a positive
integer N. The sender will be given some message d from D, and will then choose
a sequence $u = (a_1, \dots, a_N)$ of N elements of A. These inputs will be placed
successively into channels $s_1, \dots, s_N$, $s_n \in S_0$, and the receiver will observe
the resulting output sequence $v = (b_1, \dots, b_N)$. The receiver must then estimate
which message d was presented to the sender. Thus U is the set of all sequences
$u = (a_1, \dots, a_N)$ of length N of elements of A and V is the set of all sequences
$v = (b_1, \dots, b_N)$ of length N of elements of B. The set S of possible U, V
channels will depend on what restrictions we place on the sequences $s_1, \dots, s_N$.
We consider three cases.

Case 1. Fixed unknown channel. Here we are given that the same element of
$S_0$ is the transmission channel for each period. There is then one U, V channel s
for each A, B channel $\bar{s} \in S_0$. The s corresponding to $\bar{s}$ is defined by

$$s(u, v) = \prod_{n=1}^{N} \bar{s}(a_n, b_n).$$

We shall denote the set of all such U, V channels by $S_1$.
Case 2. Arbitrarily varying channel. Here there is one U, V channel for each
sequence $(s_1, \dots, s_N)$ of elements of $S_0$, defined by

$$s(u, v) = \prod_{n=1}^{N} s_n(a_n, b_n).$$

We shall denote the set of all such U, V channels by $S_2$.
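
The product constructions of Cases 1 and 2 can be sketched as below. This is a minimal illustration, assuming channels are stored as lists of rows; the name `block_channel` and the two example matrices are hypothetical. Passing the same matrix N times gives a Case 1 channel, and passing an arbitrary sequence gives a Case 2 channel.

```python
from itertools import product

def block_channel(channels):
    """Build the U, V channel for N uses from single-letter channels.

    channels is a list of N matrices (lists of rows); the result maps
    each pair (u, v) of length-N tuples to the product of the
    single-letter probabilities channels[n][u[n]][v[n]].
    """
    n_inputs = len(channels[0])
    n_outputs = len(channels[0][0])
    n_periods = len(channels)
    s = {}
    for u in product(range(n_inputs), repeat=n_periods):
        for v in product(range(n_outputs), repeat=n_periods):
            p = 1.0
            for s_n, a, b in zip(channels, u, v):
                p *= s_n[a][b]
            s[(u, v)] = p
    return s

# Hypothetical single-letter channels with inputs {0, 1}, outputs {0, 1, 2}.
CH0 = [[1.0, 0.0, 0.0], [0.0, 0.0, 1.0]]
CH1 = [[0.0, 0.0, 1.0], [0.0, 1.0, 0.0]]

case1 = block_channel([CH0, CH0])   # same channel in every period
case2 = block_channel([CH0, CH1])   # channel varies from period to period
```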

Case 3. Channel selected by jammer with knowledge of past inputs and outputs.
Here we suppose that the element $s_n$ of $S_0$ which will be the transmission channel
during the nth period is selected by a jammer after he has observed the inputs
$a_1, \dots, a_{n-1}$ and outputs $b_1, \dots, b_{n-1}$ during the first $n - 1$ periods. A pure
strategy f for the jammer is a sequence $(f_1, \dots, f_N)$ of functions, where $f_n$ maps
every sequence $z_n = (a_1, \dots, a_{n-1}, b_1, \dots, b_{n-1})$ into a corresponding
element $f_n(z_n)$ of $S_0$. There is then one U, V channel $s_f$ for each pure strategy f,
defined by

$$s_f(u, v) = \prod_{n=1}^{N} s_n(a_n, b_n), \quad \text{where } s_n = f_n(z_n).$$

We shall denote the set of all such U, V channels by $S_3$.
Let us define, for i = 1, 2, 3,

$$\pi_i(|D|, N, S_0) = \min_{k} \max_{s \in S_i} \pi(s, k).$$

The number $\pi_i(|D|, N, S_0)$ is the minimum average error probability which
can be guaranteed, by using a suitable random code, when there are | D | possible 
messages, N transmission periods, the channel at each period is some element 
of So, and the channel variation from period to period is as described in Case i 
above. It is also the value of the following two-person zero sum game: Player I 
(the jammer) chooses any U, V channel s in S;, and Player II independently 
chooses a pure (D, U, V) code c. A message is then selected at random from D, 
so that each d has probability $|D|^{-1}$ of being selected, and transmitted over
channel s using code c. If an error is made, Player I wins one unit; otherwise he 
wins zero. 
Since $\pi$ is linear in s, we have

$$\pi_i(|D|, N, S_0) = \min_{k} \max_{s \in S_i^*} \pi(s, k),$$

where $S_i^*$ is the convex hull of $S_i$, i.e. the smallest convex set containing $S_i$.
Let us for the moment denote by T the convex hull of $S_0$, by $T_i$ the set of U, V
channels defined by T in the same way that $S_i$ is defined by $S_0$, and by $T_i^*$ the
convex hull of $T_i$. It is not hard to verify that

$$S_2^* \supset T_2, \quad S_3^* \supset T_3, \quad \text{so that } S_2^* = T_2^*, \; S_3^* = T_3^*.$$
We conclude, writing $S_0^*$ for the convex hull of $S_0$, that

$$\pi_i(|D|, N, S_0) = \pi_i(|D|, N, S_0^*) \quad \text{for } i = 2, 3, \tag{1}$$

a fact which will be used later.
We shall call a number $R \ge 0$ an attainable rate of type i for $S_0$ if
$\pi_i([2^{NR}], N, S_0) \to 0$ as $N \to \infty$.

The upper bound of the set of attainable rates of type i for $S_0$ will be called
the type i capacity of the set $S_0$ and denoted by $R_i(S_0)$. Thus if R is an attainable
rate of type i we can, by random encoding in large blocks, transmit R binary
symbols per transmission period, with small error probability.

If, in the definition of $\pi_i$ above, we had minimized over pure codes instead of
random codes, we would have obtained numbers $r_i(S_0)$, which we shall call the
type i pure capacity of $S_0$. The present authors in an earlier paper obtained a
simple formula for $r_1(S_0)$. The principal result of the present paper is that

$$R_3(S_0) = R_2(S_0) = R_1(S_0^*) = r_1(S_0^*),$$

where $S_0^*$ is the convex hull of $S_0$. In addition we show that always
$R_1(S_0) = r_1(S_0)$ and give an example in which $R_3(S_0) > 0$, $r_2(S_0) = r_3(S_0) = 0$. The
evaluation of $r_2(S_0)$ and $r_3(S_0)$ for general $S_0$ remains unsettled.
We may already conclude from (1) that

$$R_i(S_0) = R_i(S_0^*) \quad \text{for } i = 2, 3. \tag{2}$$


2. Direct half of principal result. For any random variable X with a finite
set of values x, we denote by I(X) the random variable whose value when X = x
is $-\log_2 P\{X = x\}$. For any two random variables X, Y, each with a finite set of
values, we define

$$I(X \mid Y) = I(X, Y) - I(Y),$$

$$J(X, Y) = I(X) + I(Y) - I(X, Y) = I(X) - I(X \mid Y) = I(Y) - I(Y \mid X).$$


I(X) is usually called the information, entropy, or uncertainty in X, I(X | Y) 
the information in X given Y, and J(X, Y) the mutual information in X, Y. 
These concepts, introduced by Shannon [5], are basic in information theory. 
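
A small numerical sketch of these quantities may help. The function names below are hypothetical; the code computes the expected values E I(X) (entropy) and E J(X, Y) (expected mutual information) in bits from a joint distribution given as a matrix.

```python
from math import log2

def entropy(dist):
    """E I(X): expected value of -log2 P{X = x} over the distribution."""
    return -sum(p * log2(p) for p in dist if p > 0)

def expected_mutual_information(joint):
    """E J(X, Y) in bits, where joint[a][b] = P{X = a, Y = b}."""
    px = [sum(row) for row in joint]
    py = [sum(col) for col in zip(*joint)]
    return sum(
        p * log2(p / (px[a] * py[b]))
        for a, row in enumerate(joint)
        for b, p in enumerate(row)
        if p > 0
    )
```

For a joint distribution `j`, the value `expected_mutual_information(j)` agrees with `entropy(px) + entropy(py) - entropy(all cells of j)`, which is the expectation of the defining identity for J(X, Y).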
Associated with each probability distribution $\alpha$ on A and A, B channel s is a
probability distribution $P_{\alpha s}$ on the set $A \times B$ of pairs (a, b), defined by

$$P_{\alpha s}(a, b) = \alpha(a)\, s(a, b).$$


Let X, Y be the input, output variables on $A \times B$: X(a, b) = a, Y(a, b) = b,
and define, for any closed subset $S \subset M(A, B)$,

$$H_\alpha(S) = \min_{s \in S} E_{\alpha s} J_{\alpha s}(X, Y),$$

$$H(S) = \max_{\alpha} H_\alpha(S),$$

where the subscripts $\alpha s$ indicate that expectation and mutual information are
with respect to $P_{\alpha s}$.

THEOREM 1. $R_3(S_0) \ge H(S_0^*)$, where $S_0^*$ is the convex hull of $S_0$.

PROOF. We shall first suppose $S_0$ finite. It suffices to show that, for any $\alpha$ and
any number $\sigma$ with $0 < \sigma < H_\alpha(S_0^*)$, the number $H_1 = H_\alpha(S_0^*) - \sigma$ is an
attainable type 3 rate for $S_0$. Let $\delta$ be any number for which $0 < |B|\delta < 1$, where
|B| is the number of elements in B, and let $s_\delta$ be the B, B channel whose
nondiagonal elements are all equal to $\delta$, so that its diagonal elements are all equal
to $1 - (|B| - 1)\delta$. Finally, let q be any probability distribution on the finite
set F of jamming strategies f.

Let us choose a sequence $X^N = (X_1, \dots, X_N)$ of N independent input
variables, each with distribution $\alpha$, and let L be a jamming strategy, selected
independently of $X^N$ with distribution q. The input sequence $X^N$ and jamming
strategy L determine a sequence of output variables $Y^N = (Y_1, \dots, Y_N)$. We
use $Y_n$ as an input variable on the B, B channel $s_\delta$, and let $Z_n$ be the resulting
output variable. Write $Z^n = (Z_1, \dots, Z_n)$, $n = 1, \dots, N$. Then

$$\begin{aligned}
P\{Y^N = v \mid X^N = u\} &= s(u, v), \qquad s = \sum_f q(f)\, s_f, \\
P\{Z_n = b \mid X^n, Y^n, L\} &= s_\delta(Y_n, b), \\
P\{(X_n, Z_n) = (a, b) \mid X^{n-1}, Y^{n-1}, Z^{n-1}, L\} &= P_{\alpha s^* s_\delta}(a, b),
\end{aligned} \tag{3}$$

where $s^*$ is the element of $S_0$ selected by L for the nth transmission period when
the previous input-output history is $X^{n-1}$, $Y^{n-1}$. From (3) we obtain

$$\begin{aligned}
P\{(X_n, Z_n) = (a, b) \mid X^{n-1}, Z^{n-1}\}
&= \sum_{y, f} P\{Y^{n-1} = y, L = f \mid X^{n-1}, Z^{n-1}\} \\
&\quad \cdot P\{(X_n, Z_n) = (a, b) \mid X^{n-1}, Z^{n-1}, Y^{n-1} = y, L = f\}
= P_{\alpha t s_\delta}(a, b),
\end{aligned} \tag{4}$$

where t, an average of elements $s^*$ of $S_0$, belongs to $S_0^*$, the convex hull of $S_0$.


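We shall find an upper bound for $P\{J(X^N, Z^N) \le N(H_1 + \gamma)\}$, where $\gamma$ is a
positive number less than $\sigma$. We write

$$J(X^N, Z^N) = \sum_{n=1}^{N} \left[ J(X^n, Z^n) - J(X^{n-1}, Z^{n-1}) \right] = \sum_{n=1}^{N} J_n,$$

where

$$J_n = I(X_n) + I(Z_n \mid Z^{n-1}) - I\big((X_n, Z_n) \mid (X^{n-1}, Z^{n-1})\big).$$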


Let us fix x*, z* and denote by $\mu$ the conditional joint distribution of $(X_n, Z_n)$
given $X^{n-1} = x^*$, $Z^{n-1} = z^*$ and by $\beta$ the conditional distribution of $Z_n$ given
$Z^{n-1} = z^*$. The conditional distribution of $J_n$, given $X^{n-1} = x^*$, $Z^{n-1} = z^*$, is
then that of $T = I(X) - \log_2 \beta(Z) - I(X, Z)$, where X, Z are the input-output
variables on $A \times B$ and $\mu$ is the distribution on $A \times B$. Now

$$T = J(X, Z) + \log_2 \beta'(Z) - \log_2 \beta(Z),$$

where $\beta'$ is the distribution of Z. Since

$$E\big(\log_2 \beta'(Z) - \log_2 \beta(Z)\big) = -\sum_z \beta'(z) \log_2\big(\beta(z)/\beta'(z)\big) \ge 0$$

(using convexity of $-\log_2$), we obtain $ET \ge EJ(X, Z)$. From (4), $\mu$ is a
distribution $P_{\alpha t s_\delta}$ for some $t \in S_0^*$, so that, denoting by $S^\delta$ the set of all A, B
channels of the form $t s_\delta$, $t \in S_0^*$, we have

$$ET \ge H_\alpha(S^\delta) = h(\delta). \tag{5}$$


We next find an upper bound for |T|. We have $T = -\log_2 \beta(Z) - I(Z \mid X)$.
Now $\beta(b) \ge \delta$ and $t s_\delta(a, b) \ge \delta$ for all a, b. Thus, since $0 \le -\log_2 \beta(Z) \le -\log_2 \delta$
and $0 \le I(Z \mid X) \le -\log_2 \delta$, we have

$$|T| \le -\log_2 \delta. \tag{6}$$


Using (5) and (6), we find a bound for the moment generating function of
the variable $T_1 = T - h(\delta) + \lambda$, where $\lambda$ is a positive number. From (5), (6)
we obtain $E(T_1) \ge \lambda$, $|T_1| \le \lambda - \log_2 \delta = Q = Q(\lambda, \delta)$. For $t \le 0$, we have
$e^{tT_1} \le 1 + tT_1 + [(tQ)^2/2]\, e^{-tQ}$, so that $\varphi(t) = E e^{tT_1} \le 1 + \lambda t + [(tQ)^2/2]\, e^{-tQ}$.
From now on, we restrict $\lambda$, $\delta$ to the set

$$\lambda / Q \le \log(4/3). \tag{9}$$

With this restriction, and $t_0 = -\lambda/Q^2$, we obtain

$$\varphi(t_0) \le 1 - (\lambda^2 / 3Q^2) = \rho_1 = \rho_1(\lambda, \delta). \tag{10}$$
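
The step from (9) to (10) can be checked numerically. The sketch below assumes that the bound on the moment generating function reads 1 + λt + [(tQ)²/2] e^{−tQ} for t ≤ 0 (our reading of the garbled source) and that the logarithm in (9) is natural. The function names are hypothetical.

```python
from math import exp, log

def mgf_upper_bound(t, lam, Q):
    """Assumed bound on phi(t) for t <= 0:
    1 + lam*t + ((t*Q)**2 / 2) * exp(-t*Q)."""
    return 1.0 + lam * t + 0.5 * (t * Q) ** 2 * exp(-t * Q)

def rho_1(lam, Q):
    """Right side of (10): 1 - lam**2 / (3 * Q**2)."""
    return 1.0 - lam ** 2 / (3.0 * Q ** 2)

def bound_10_holds(lam, Q):
    """Check (10) at t0 = -lam / Q**2 under restriction (9)."""
    if lam / Q > log(4.0 / 3.0):
        raise ValueError("restriction (9) is violated")
    t0 = -lam / Q ** 2
    return mgf_upper_bound(t0, lam, Q) <= rho_1(lam, Q) + 1e-12
```

At t0 the bound equals 1 − λ²/Q² + (λ²/2Q²) e^{λ/Q}, and (9) gives e^{λ/Q} ≤ 4/3, which yields 1 − λ²/3Q².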


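Now $\varphi$ is the conditional moment generating function of $J_n - h(\delta) + \lambda$,
given $X^{n-1} = x^*$, $Z^{n-1} = z^*$. It follows that
$E\big(\exp t_0 \sum_{n=1}^{N} (J_n - h(\delta) + \lambda)\big) \le \rho_1^N$, so that

$$P\{J(X^N, Z^N) \le N(h(\delta) - \lambda)\} \le \rho_1^N(\lambda, \delta). \tag{11}$$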


Now $h(\delta) \to H_\alpha(S_0^*)$ as $\delta \to 0$. Choose $\delta_0$ sufficiently small so that

$$h(\delta_0) > H_\alpha(S_0^*) - \sigma + \gamma = H_1 + \gamma$$

and $h(\delta_0) - H_1 - \gamma \le -\log_2 \delta_0 \cdot \log(4/3)$, and set $\lambda_0 = h(\delta_0) - H_1 - \gamma$. From
(11) we obtain

$$P\{J(X^N, Z^N) \le N(H_1 + \gamma)\} \le \rho^N = \rho^N(\sigma - \gamma), \tag{12}$$

where $\rho = \rho_1(\lambda_0, \delta_0) < 1$ and depends only on $\sigma - \gamma$ and the modulus of
continuity of the function h. Inequality (12) is the first, and most difficult, step
in our proof.

Now

$$P\{Z^N = v \mid X^N = u\} = \sum_{v'} P\{Y^N = v' \mid X^N = u\}\, P\{Z^N = v \mid Y^N = v'\} = s s_\delta(u, v),$$

where s is the U, V channel defined in (3) and $s_\delta$ is the V, V channel which
sends inputs $Y^N$ into outputs $Z^N$, with $\delta = \delta_0$. We now apply a fundamental
inequality of Shannon [6], which asserts the existence, for any message set D
with $|D| < 2^{NH_1}$, of a pure U, V code $c = (s_1, s_2)$, whose average error
probability, on channel $s s_\delta$, is at most $P\{J(X^N, Z^N) \le N(H_1 + \gamma)\} + 2^{-N\gamma}$. Thus,
using (12), we obtain $\pi(s s_\delta, c) \le 2\rho_2^N$, where $\rho_2 = \min_{0<\gamma<\sigma} \max(2^{-\gamma}, \rho(\sigma - \gamma)) < 1$.
Now $\pi(s s_\delta, c) = \pi(s, c^*)$, where $c^* = (s_1, s_\delta s_2)$.

We have now proved the 

LEMMA. There is a constant $\rho_2 < 1$ such that, for $|D| = [2^{NH_1}]$ and any
probability distribution q on the set F of U, V jamming strategies, there is a D, U, V
code c* for which

$$\sum_f q(f)\, \pi(s_f, c^*) \le 2\rho_2^N.$$

We now consider the two-person zero sum game in which the pure strategies
for Player I are the U, V jamming strategies f, the pure strategies for Player II
are the pure D, U, V codes c, and the payoff to Player I for f, c is $\pi(s_f, c)$,
the average error probability for code c on the channel $s_f$ determined by the
jamming strategy f. The lemma asserts that, for any given mixed strategy q of
Player I, there is a corresponding strategy for Player II which makes the payoff
to I at most $2\rho_2^N$. The minimax theorem then asserts the existence of a mixed
strategy for Player II, i.e., a probability distribution k over the set C of pure
D, U, V codes, for which $\sum_c k(c)\, \pi(s_f, c) = \pi(s_f, k) \le 2\rho_2^N$ for all jamming
strategies f, i.e., $\pi(s, k) \le 2\rho_2^N$ for all $s \in S_3$. Thus

$$\pi_3([2^{NH_1}], N, S_0) \le 2\rho_2^N \to 0 \quad \text{as } N \to \infty,$$

$H_1$ is an admissible rate of type 3, and the proof of Theorem 1 is complete for
the case of finite $S_0$.

The restriction to finite $S_0$ was made only to avoid irrelevant details, e.g.,
measurability of jamming strategies. This restriction can now easily be removed
by approximation. For an arbitrary $S_0$, let T be any set which contains $S_0$ and
which is the convex hull of a finite set. Clearly $R_3(S_0) \ge R_3(T)$, and we have
shown that $R_3(T) \ge H(T)$. Thus $R_3(S_0) \ge \sup_T H(T)$. It is not difficult to
show that $\sup_T H(T) = H(S_0^*)$, completing the proof.


3. Converse half of principal result. 

THEOREM 2. For any closed $S_0$, $R_1(S_0) \le H(S_0)$.

PROOF. It was proved in [1] that $r_1(S_0) \le H(S_0)$. The present proof is a minor
modification of the earlier one. Again, we shall use

Fano’s inequality [2], [7]. For any two random variables W, W’,

$$E I(W \mid W') \le -[g \log_2 g + (1 - g) \log_2 (1 - g)] + g \log_2 (G - 1),$$

where $g = \Pr\{W \ne W'\}$ and G is the number of values of W.
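
Fano's inequality can be checked on a small joint distribution. The following is a sketch under stated assumptions: W and W’ take values in the same set of G elements, the joint distribution is given as a G by G matrix, and the function names and the example matrix are hypothetical.

```python
from math import log2

def conditional_information(joint):
    """E I(W | W') in bits, where joint[w][w2] = P{W = w, W' = w2}."""
    col = [sum(c) for c in zip(*joint)]
    return -sum(
        p * log2(p / col[j])
        for row in joint
        for j, p in enumerate(row)
        if p > 0
    )

def fano_bound(joint):
    """Right side of Fano's inequality for a square joint distribution."""
    G = len(joint)
    g = 1.0 - sum(joint[i][i] for i in range(G))
    g = min(max(g, 0.0), 1.0)  # guard against rounding outside [0, 1]
    binary = 0.0 if g in (0.0, 1.0) else -(g * log2(g) + (1 - g) * log2(1 - g))
    extra = g * log2(G - 1) if G > 1 and g > 0 else 0.0
    return binary + extra

# Hypothetical example: three equally likely messages, decoded correctly
# with probability 0.8 and confused unevenly with the other two.
skewed = [
    [0.80 / 3, 0.15 / 3, 0.05 / 3],
    [0.05 / 3, 0.80 / 3, 0.15 / 3],
    [0.15 / 3, 0.05 / 3, 0.80 / 3],
]
```

When the errors are spread evenly over the G − 1 wrong values, the two sides coincide; the uneven example above leaves a gap.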

We consider a random (D, U, V) code k, take any U, V channel $s \in S_1$, and
suppose that a message is selected from D with a uniform distribution and
transmitted over s using k. We denote by W, X, Y, W’ the resulting message, U, V
input, U, V output, and estimated message respectively. Let $g = \pi(s, k) = \Pr\{W' \ne W\}$.
Let us denote by Z the pure code selected, so that Z is independent
of W and has distribution k. Then

$$\begin{aligned}
EJ(X, Y \mid Z) &\ge EJ(W, W' \mid Z) \\
&= EI(W) - EI(W \mid W', Z) \\
&\ge EI(W) - EI(W \mid W') \\
&\ge (1 - g) \log_2 |D| - 1,
\end{aligned} \tag{13}$$

where the last inequality is obtained from Fano’s inequality. Also

$$\begin{aligned}
EJ(X, Y \mid Z) &= EI(Y \mid Z) - EI(Y \mid X, Z) = EI(Y \mid Z) - EI(Y \mid X) \\
&\le EI(Y) - EI(Y \mid X) = EJ(X, Y).
\end{aligned}$$

Combining this inequality with (13) yields

$$EJ(X, Y) \ge (1 - g) \log_2 |D| - 1, \tag{14}$$

i.e.,

$$g = \pi(s, k) \ge 1 - [EJ(X, Y) + 1] / \log_2 |D|. \tag{15}$$

Since the distribution of X is independent of s, we maximize (15) over $s \in S_1$,
then minimize over k, to obtain

$$\pi_1(|D|, N, S_0) \ge 1 - [H(S_1) + 1] / [\log_2 |D|]. \tag{16}$$

But, as shown in [1], $H(S_1) = N H(S_0)$, so that

$$\pi_1([2^{NR}], N, S_0) \ge 1 - [N H(S_0) + 1] / [\log_2 [2^{NR}]]. \tag{17}$$

Thus if R is an admissible rate of type 1, $\lim_{N \to \infty} [N H(S_0) + 1] / [\log_2 [2^{NR}]] \ge 1$,
i.e., $R \le H(S_0)$. This completes the proof.

We summarize our results in 

THEOREM 3. For any $S_0$,

$$R_3(S_0) = R_2(S_0) = R_1(S_0^*) = r_1(S_0^*),$$

where $S_0^*$ is the convex hull of $S_0$. Also $R_1(S_0) = r_1(S_0) = H(S_0)$.

PROOF. That $r_1(S_0) = H(S_0)$ was shown in [1]. Since $r_1(S_0) \le R_1(S_0)$ and,
from Theorem 2, $R_1(S_0) \le H(S_0)$, we have $R_1(S_0) = r_1(S_0) = H(S_0)$. The
chain of inequalities

$$H(S_0^*) \le R_3(S_0) \le R_2(S_0) = R_2(S_0^*) \le R_1(S_0^*) \le H(S_0^*)$$

completes the proof of Theorem 3.

An example and an open question. We have associated with a set $S_0$ of A, B
channels six capacities, according as (a) we face (1) the same unknown channel
in $S_0$ each period, (2) an unknown channel varying arbitrarily in $S_0$ from period
to period, or (3) an unknown channel in $S_0$, selected each period by a jammer
with knowledge of previous inputs and outputs, and (b) we are restricted to
pure codes or are allowed to use random codes. Of these six numbers, we have
evaluated four: $r_1(S_0)$ and $R_i(S_0)$, i = 1, 2, 3.

The evaluation of $r_2(S_0)$, $r_3(S_0)$ remains unsolved. We conclude with an
example in which $r_2(S_0) = r_3(S_0) = 0$, while $R_2(S_0) = R_3(S_0) = 1/2$. This
example illustrates that, against an unknown arbitrarily varying channel, or against
a jammer, random codes are a real improvement over pure codes.

In our example, $S_0$ consists of two noiseless channels, labeled 0 and 1. Each
channel has two inputs, 0 and 1, and three outputs, 0, 1, and 2. Channel i
transmits input i perfectly, but changes the other input $1 - i$ into 2:

    Input    Channel 0 output    Channel 1 output
      0             0                   2
      1             2                   1

We shall prove that, for any number N and any pure D, U, V code $c = (s_1, s_2)$,
there is a channel $s \in S_2$ for which

$$\pi(s, c) \ge (G - 1)/2G, \tag{18}$$

where G = |D| = number of messages in D.

Thus no set with two or more messages can be transmitted by a pure code
with average error probability less than 1/4 over every sequence of channels in
$S_0$, no matter how many transmission periods are allowed. It follows that
$r_2(S_0) = 0$, and a fortiori $r_3(S_0) = 0$. On the other hand, our formula

$$R_3(S_0) = \max_{\alpha} \min_{s \in S_0^*} E_{\alpha s} J_{\alpha s}(X, Y)$$

yields $R_3(S_0) = 1/2$, with $\alpha = (1/2, 1/2)$ as the maximizing input distribution and the
channel s with matrix

$$\begin{pmatrix} 1/2 & 0 & 1/2 \\ 0 & 1/2 & 1/2 \end{pmatrix},$$

the midpoint of the channels in $S_0$, as the minimizing channel in $S_0^*$.
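
The value 1/2 can be reproduced numerically. The sketch below parametrizes the convex hull of the two example channels by a mixing weight and searches over grids; the grid search and the function names are assumptions made for illustration, not the authors' method.

```python
from math import log2

CH0 = [[1.0, 0.0, 0.0], [0.0, 0.0, 1.0]]  # channel 0: input 1 becomes 2
CH1 = [[0.0, 0.0, 1.0], [0.0, 1.0, 0.0]]  # channel 1: input 0 becomes 2

def mix(lam):
    """Element lam * CH0 + (1 - lam) * CH1 of the convex hull."""
    return [[lam * x + (1 - lam) * y for x, y in zip(r0, r1)]
            for r0, r1 in zip(CH0, CH1)]

def mutual_information(alpha, s):
    """Expected mutual information (bits) for input distribution alpha."""
    n_in, n_out = len(alpha), len(s[0])
    py = [sum(alpha[a] * s[a][b] for a in range(n_in)) for b in range(n_out)]
    return sum(
        alpha[a] * s[a][b] * log2(s[a][b] / py[b])
        for a in range(n_in)
        for b in range(n_out)
        if alpha[a] * s[a][b] > 0
    )

def h_alpha(alpha, steps=100):
    """Minimum over a grid of hull elements (approximates the min over the hull)."""
    return min(mutual_information(alpha, mix(i / steps)) for i in range(steps + 1))

best = max(h_alpha((p / 20, 1 - p / 20)) for p in range(21))
```

With the uniform input distribution the minimum is attained at the midpoint channel, where the expected mutual information equals 1/2 bit.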

To verify (18), let N be any positive integer, let D be any message set with
|D| = G elements, and let $c = (s_1, s_2)$ be any pure D, U, V code. Let $x_{dn}$ denote
the nth input specified by c for transmitting message d, and let $x_d$ denote the
vector whose coordinates are $x_{dn}$, n = 1, 2, ..., N: $x_d$ is the vector in U for
which $s_1(d, x_d) = 1$. Let us denote by s(d) that U, V channel in $S_2$ which
transmits $x_d$ perfectly: s(d) has channel number $x_{dn}$ as its nth coordinate. We note
that the output v corresponding to any input u and any U, V channel $s \in S_2$ has
for its nth coordinate the common value of the nth coordinate of u and the number
of the nth channel of s, if these numbers agree, and has 2 if they do not. Thus,
denoting this output vector by v(u, s), we have

$$v(x_d, s(d')) = v(x_{d'}, s(d)).$$


The probability p(d, d’) of an error in transmitting message d over channel
s(d’) is 0 if $v(x_d, s(d'))$ is decoded as d, and 1 otherwise. If $d' \ne d$, the vector
$v(x_d, s(d')) = v(x_{d'}, s(d))$ cannot be decoded as both d and d’, so that
$p(d, d') + p(d', d) \ge 1$ for $d' \ne d$. Summing this inequality over all pairs d, d’ with
$d' \ne d$ yields

$$2G \sum_d \pi(s(d), c) \ge G(G - 1),$$

so that, for some d, $\pi(s(d), c) \ge (G - 1)/2G$, and (18) is verified.
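
For very small cases, inequality (18) can also be confirmed by exhaustive search over all pure codes. The sketch below is illustrative only; the function names are hypothetical, and the search is feasible only for tiny G and N because the number of decoders grows as G to the power 3^N.

```python
from itertools import product

def output(u, chans):
    """Output vector when input u passes through the noiseless channels
    numbered chans: a coordinate survives if it matches the channel
    number and becomes 2 otherwise."""
    return tuple(a if a == c else 2 for a, c in zip(u, chans))

def worst_case_error(encode, decode, n_periods):
    """Largest average error of a pure code over all channel sequences."""
    G = len(encode)
    worst = 0.0
    for chans in product((0, 1), repeat=n_periods):
        errors = sum(decode[output(encode[d], chans)] != d for d in range(G))
        worst = max(worst, errors / G)
    return worst

def min_over_pure_codes(G, n_periods):
    """Smallest worst-case average error over all pure codes."""
    inputs = list(product((0, 1), repeat=n_periods))
    outputs = list(product((0, 1, 2), repeat=n_periods))
    best = 1.0
    for encode in product(inputs, repeat=G):
        for labels in product(range(G), repeat=len(outputs)):
            decode = dict(zip(outputs, labels))
            best = min(best, worst_case_error(encode, decode, n_periods))
    return best
```

For example, `min_over_pure_codes(2, 2)` examines every pure code for two messages and two periods and returns a value no smaller than (G − 1)/2G = 1/4.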
ON THE ESTIMATION OF THE SPECTRUM OF A STATIONARY
STOCHASTIC PROCESS 


By K. R. PARTHASARATHY 
Indian Statistical Institute, Calcutta 


1. Introduction. Recently many authors have been interested in the problem 
of estimating the spectral density function of a weakly stationary process. Under 
assumptions of linearity of the process and existence of derivatives of the spectral 
density, U. Grenander and M. Rosenblatt [1] have investigated the asymptotic 
behaviour of various estimates. E. Parzen [2] has investigated the asymptotic 
behaviour of different types of errors of the estimates under assumptions of
fourth order stationarity and exponential or algebraic decrease of the covariance 
sequence. 

In this paper, the problem of estimating the spectral distribution as well as 
the spectral density (if it exists) of a weakly stationary process is solved under 
the sole assumption that the sample covariances converge almost surely and in 
mean to the true covariances. The relevance of Bochner’s work on Fourier 
analysis [3], in obtaining more exact expressions for the bias of estimates, is 
pointed out. The existence of estimates which converge uniformly strongly to 
the spectral density of the process is proved under the assumption that the 
density has an absolutely convergent Fourier series. It should be added that 
only questions of consistency are discussed here, and no attempt is made to
derive the asymptotic distribution of the estimates. 


2. Estimates of the Spectral Distribution Function. 
Definitions: We suppose that $x_1, x_2, \dots, x_N$ are observations at N consecutive
time points on a discrete weakly stationary stochastic process

$$\{x_t\} \quad (t = \dots, -1, 0, 1, \dots),$$

with the well-known spectral representation (cf. [1])

$$x_t = \int_{-\pi}^{\pi} e^{it\lambda}\, dZ(\lambda); \quad E x_t = 0; \quad \rho_\nu = \rho_{-\nu} = E x_t x_{t+\nu} = \int_{-\pi}^{\pi} e^{i\nu\lambda}\, dF(\lambda), \tag{2.1}$$

where $Z(\lambda)$ is an orthogonal stochastic set function (cf. [1]) and $F(\lambda)$ is a
monotonic right continuous function in $[-\pi, \pi]$. It is easily seen that

$$\hat{\rho}_\nu = \hat{\rho}_{-\nu} = (x_1 x_{1+|\nu|} + \dots + x_{N-|\nu|} x_N) / (N - |\nu|) \tag{2.2}$$

is an unbiased estimate of $\rho_\nu$. We shall consider the following estimate of the
spectral distribution:

$$F_N^*(\lambda) = \frac{1}{2\pi} \sum_{k=-R(N)}^{+R(N)} a_{k,N}\, (\hat{\rho}_k / ik)\, e^{ik\lambda}, \tag{2.3}$$

where the term corresponding to k = 0 is $a_{0,N}(\lambda + \pi)$, and the $a_{k,N}$ are
constants chosen such that the following conditions are satisfied:


1) $a_{k,N} \to 1$ as $N \to \infty$ for each fixed k,
2) $a_{k,N} = a_{-k,N}$,
3) $F_N^*(\lambda)$ is a distribution function.


As is known from previous work [1], [2], it is advantageous to choose R(N) =
o(N). We shall now state, without proof, a theorem concerning the convergence
of the estimates $F_N^*(\lambda)$.
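
The covariance estimate (2.2) is simple to compute directly. The following sketch assumes a zero-mean series stored in a Python list; the function name `sample_covariance` is hypothetical.

```python
def sample_covariance(x, lag):
    """Estimate (2.2) of the covariance at the given lag for a zero-mean
    series x: the sum of the N - |lag| available products divided by
    N - |lag|, which makes the estimate unbiased."""
    k = abs(lag)
    n = len(x)
    if k >= n:
        raise ValueError("lag must be smaller than the number of observations")
    return sum(x[t] * x[t + k] for t in range(n - k)) / (n - k)
```

Dividing by N − |lag| rather than N removes the bias but inflates the variance at large lags, which is one reason for truncating the sum in (2.3) at R(N) = o(N).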

THEOREM 2.1: If $\{x_t\}$ is a weakly stationary process with a spectral distribution
$F(\lambda)$, and the sample covariances converge almost surely to the true covariances,
then $P\{F_N^* \to F\} = 1$. If, however, $F(\lambda)$ is continuous, then

$$P\Big[\sup_{|\lambda| \le \pi} |F_N^*(\lambda) - F(\lambda)| \to 0 \text{ as } N \to \infty\Big] = 1.$$

If, further, the sample covariances converge in mean to the true covariances, then

$$\lim_{N \to \infty} E \sup_{|\lambda| \le \pi} |F_N^*(\lambda) - F(\lambda)| = 0.$$

The first part of the theorem is contained in Doob [4]; the second part follows 
by an application of a theorem of Pólya to the effect that the weak convergence
of a sequence of distributions to a continuous distribution implies uniform con- 
vergence; the last part follows from an easy computation. 

The choice of the constants $a_{k,N}$: Our main object is to make a suitable choice
of the constants $a_{k,N}$, and to examine the order of the bias, convergence, etc.,
of the estimates thus obtained. The method we use for this purpose is simply a 
Fourier analysis. It is based almost entirely on the work of Bochner [3]. We now 
state the main result of Bochner, in the form required here. 

Let f(x) be a continuous periodic function with period $2\pi$ and let


sin t/2\’ ** /sin t/2\” 
R(t) w({=4 ) M(r) = | (a fa) 
( t/2 ’ a(r) 2 t/2 * 


ay _ (KOT 
RAD) = “Gy ° 


$$S_R^r(x) = \int_{-\infty}^{\infty} f(x + t)\, (R/r)\, K_r(Rt/r)\, dt.$$


THEOREM (BOCHNER): For any continuous periodic function f(x),

$$|S_R^r(x) - f(x)| = O[\omega(4r/R) + 4^{-r}],$$

$$\omega(x) = \max_{|x_1 - x_2| \le x} |f(x_1) - f(x_2)|.$$
Let

$$f_N(\lambda) = \frac{1}{2\pi N} \int_{-\infty}^{\infty} \Big| \sum_{t=1}^{N} x_t e^{it(\lambda + u)} \Big|^2 (R/r)\, K_r(Ru/r)\, du, \tag{2.4}$$

$$F_N(\lambda) = \int_{-\pi}^{\lambda} f_N(\mu)\, d\mu. \tag{2.5}$$


Then it is possible to write $F_N(\lambda)$ as given in (2.3) and to show that all the
required conditions are satisfied. Thus, by Theorem 2.1, $F_N$ is a consistent
estimator of F under very mild conditions. We now state a theorem concerning
the bias of $F_N$ as an estimate of the spectral distribution.


THEOREM 2.2. For a weakly stationary process $\{x_t\}$ with a continuous spectral
distribution F, we have


sup | EFx() — F(A) | =Ofw(4r/R) + 47 + R/Nw(R™)), 


|A/<r 


where $\omega(x) = \max_{|\lambda_1 - \lambda_2| \le x} |G(\lambda_1) - G(\lambda_2)|$,

and

$$G(\lambda) = F(\lambda) - [(\lambda + \pi)/2\pi]\, \rho_0.$$


Since the above is an easy consequence of Bochner’s theorem, the proof is 
omitted. 


COROLLARY: If $F(\lambda)$ satisfies Lipschitz’s condition, i.e.

$$|F(\lambda_1) - F(\lambda_2)| < c\, |\lambda_1 - \lambda_2|,$$

where c is a constant, then $\omega(x) < cx$ for any x > 0, and hence
sup | EFx(A) — F(A) | = Ofr/R + 47 + ((R)*/N)}. 


Thus, in order to obtain an asymptotically unbiased and consistent estimator
of F, we have only to choose r and R such that $r \to \infty$, $R \to \infty$, $r/R \to 0$ and
$R/N \to 0$ as $N \to \infty$ in $F_N(\lambda)$.

For Gaussian processes the following theorem can be easily proved. 


THEOREM 2.3. For a Gaussian process with a square integrable spectral density 
we have 


E sup | Fx(A) — F(A) | = Of(log R/(N)') + w(4r/R) + 4°). 
\Al s2 

3. Convergence of the Spectral Density. In this section we shall discuss the
choice of r and R so that the estimate $f_N(\lambda)$ given in (2.4) converges (almost
surely) uniformly to the spectral density of the process. Our choice will be such
that r and R are not only functions of N but of the observations themselves. 
It should be noted that, even if r and R depend on the observations, Theorem 
2.1 remains valid provided that r and R diverge to infinity with probability one. 
We require the following

LEMMA 3.1. For any weakly stationary process $x_t$, if $\sum_{t=1}^{N} x_t^2 / N$ is convergent
with probability one as $N \to \infty$, then, for $\epsilon > 0$,

$$P\Big[\sup_N \sup_{0 \le k \le N^{1-\epsilon}} |\hat{\rho}_k| < \infty\Big] = 1,$$

where $\hat{\rho}_k$ is as in (2.2).
PROOF:

$$|\hat{\rho}_k| = \Big| \sum_{t=1}^{N-k} x_t x_{t+k} \Big| \Big/ (N - k) \le [1/(N - k)] \Big( \sum_{1}^{N-k} x_t^2 \Big)^{1/2} \Big( \sum_{k+1}^{N} x_t^2 \Big)^{1/2},$$

so that

$$\sup_{0 \le k \le N^{1-\epsilon}} |\hat{\rho}_k| \le [1/(N - N^{1-\epsilon})] \sum_{1}^{N} x_t^2 \le \Big( \sum_{1}^{N} x_t^2 \Big) \Big/ [N(1 - 2^{-\epsilon})] \tag{3.1}$$

for $N \ge 2$. Since by assumption $(\sum_{1}^{N} x_t^2)/N$ converges, the expression on the
right side of (3.1) is bounded with probability one. This completes the proof.
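Our estimate of the spectral density function is

$$f_N(\lambda) = \frac{1}{2\pi N} \int_{-\infty}^{\infty} \Big| \sum_{t=1}^{N} x_t e^{it(\lambda+u)} \Big|^2 (R/r)\, K_r(Ru/r)\, du,$$

which can also be rewritten as

$$f_N(\lambda) = \frac{1}{2\pi} \sum_{m=-R}^{+R} c_r(rm/R)\, (1 - |m|/N)\, \hat{\rho}_m e^{im\lambda},$$

$$c_r(t) = \int_{-\infty}^{\infty} e^{itx} K_r(x)\, dx.$$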


We now prove the following 

THEOREM 3.1. Let $\{x_t\}$ be a weakly stationary process, with spectral density
function $f(\lambda)$ and covariance sequence $\{\rho_k\}$, which has the property that $\sum_{-\infty}^{\infty} |\rho_k|$ is
convergent. Suppose, further, that the sample variance and covariances converge
almost surely, and in the $L_1$ mean, to the true variance and covariances respectively.
Then there exist $R(N, x_1, x_2, \dots, x_N)$ and $r(N, x_1, x_2, \dots, x_N)$ such that

$$\sup_{|\lambda| \le \pi} |f_N(\lambda) - f(\lambda)| \to 0$$

almost surely as $N \to \infty$.
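PROOF:

$$\begin{aligned}
f_N(\lambda) &= \frac{1}{2\pi} \sum_{m=-R}^{+R} c_r(rm/R)\, (1 - |m|/N)\, (\hat{\rho}_m - \rho_m)\, e^{im\lambda} \\
&\quad + \frac{1}{2\pi} \sum_{m=-R}^{+R} c_r(rm/R)\, (1 - |m|/N)\, \rho_m e^{im\lambda} = S_1 + S_2, \text{ say.}
\end{aligned} \tag{3.3}$$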
For $S_1$ we have


+R 


(3.4) 


if R < [N*“‘], « > 0, 6 > O. Since for each m, | jm — pm | — 0 with probability 


one, and by Lemma 3.1, aos [( | Bm — pm | )/m**”) is bounded, we get by 
Toeplitz’s lemma [5] the following: 


(vi~*] 


(3.5) Pilim >> [(| fm — pm| )/m'**?)-1/m*? = 0} = 1. 
1 


Noo 


We choose R such that R > ~, with probability one, R < [N’™‘], and 
[wi~*] > -1/4+8) 
(3.6) R=o| > (( | Bm — pm | )/m'**) | ’ 
i 
Then 
P{sup|S,:|-0 as N-> @] = 1. 
x 


Turning to $S_2$, we have
t ’ 


(3.7) S(r) — f() = [. [fw(\ + t) — f(A) |R/r K, (Rt/r) dt, 


where 


+N 
(3.8) fu(\) = 1/2e > (1 — (| m|/N))pme™ 
—N 


is the Nth Fejér mean of $f(\lambda)$. Since $\sum_{-\infty}^{\infty} |\rho_k|$ is convergent, $f(\lambda)$ is bounded
and continuous. Since $f(\lambda)$ is symmetric in $\lambda$, $f(\pi) = f(-\pi)$. Hence, by Fejér’s
theorem,

(3.9) lim sup | fw(A) — f(A) | = 0. 


Now Al gF 


From (3.7) we have 


sup | S.(A) — f(A) | Ss sup | fu(y +t) —fAtt) | 
x a a) 
(3.10) 


.(R/r)K,(Rt/r) dt + sup | [f(a + t) — f(A)] (R/r) K,(Rt/r) dt}. 
x =) 


Since {7% (R/r)K,(Rt/r) dt = 1, the first term on the right of (3.10) goes to 
zero as $N \to \infty$. If we choose r such that $r \to \infty$ and $r/R \to 0$ as $N \to \infty$, it
is easily seen from Bochner's theorem that the second term also goes to zero
with probability one.
We remark that, if we choose R = o(N) and r = o(R), the theorems of U.
Grenander and M. Rosenblatt [1] on the consistency of the spectral estimates 
for linear processes become applicable. 

Finally, let us consider the behaviour of the periodogram of a stationary 
Gaussian process. It is well-known that the periodogram does not converge to 
any random variable as the sample size increases to infinity. However, the fol- 
lowing theorem holds. 

THEOREM 3.2. For a stationary Gaussian process with a spectral density $f(\lambda)$
satisfying Lipschitz's condition,


2 2 


N N 
> x, cost d >, 2 sin td 


> . 1 : t=1 . -t be yr sai 
PS  Winlen 1 22S Binks 7’) 7) 


The proof follows from the analyses of W. Feller [6] and G. Maruyama [7]. 
EXPECTATIONS OF FUNCTIONALS ON A STOCHASTIC PROCESS


By EDWARD NELSON AND DALE VARBERG
University of North Dakota and Hamline University


1. Introduction. Let $\{x(t), 0 \le t < \infty\}$ be a separable stochastic process with
stationary, independent increments, for which $x(0) = 0$ and whose characteristic
function is

$$E\{e^{i\xi x(t)}\} = e^{-at(1 - \cos \xi)}, \qquad a > 0.$$

One may verify that, if $0 \le t_1 < t_2 < \cdots < t_k < \infty$ and $m_i$ is an integer,

$$P\{x(t_k) = m_k, x(t_{k-1}) = m_{k-1}, \ldots, x(t_1) = m_1\} = e^{-at_k} I_{m_k - m_{k-1}}[a(t_k - t_{k-1})] \cdots I_{m_2 - m_1}[a(t_2 - t_1)]\, I_{m_1}(at_1),$$

where $I_n(z) = i^{-n} J_n(iz)$, $J_n(z)$ being the Bessel function of the first kind.
By separability the sample functions, $x(t)$, of this process are simple functions
which assume integral values on intervals. They may be interpreted as the
monetary gain in coin tossing at random times. To be more precise, $x(t)$ is the
sum of a random number, $N(t)$, of independent, identically distributed Bernoulli
variables with distribution $P\{x = -1\} = P\{x = 1\} = \tfrac{1}{2}$, where $N(t)$ is the
sample function of a Poisson process ([1], page 398). This process is important
in the theory of collective risk and has been studied by Tacklind [2]. Certain
similarities between it and the Wiener process led us to attempt to find the expected
value of some functionals on this process using a method developed by
Kac ([3], Section 3). The principal result of this paper is the following theorem.
THEOREM. Let

$$\Psi_n = \int_0^\infty e^{-st} E\Big\{\exp\Big[-u \int_0^t V(x(\tau))\, d\tau\Big],\ x(t) = n\Big\}\, dt \tag{1.1}$$

where V is non-negative. Then $\Psi_n$ satisfies the difference system

$$\Psi_{n+1} - (2/a)(s + a + uV_n)\Psi_n + \Psi_{n-1} = -(2/a)\delta_{n,0}, \qquad \Psi_n \to 0 \ \text{as}\ n \to \pm\infty, \tag{1.2}$$

where $V_n$ is the value of the function V when x = n. (Note: For any function
K, $E\{K(x), x(t) = n\}$ means $E\{K(x)\chi(x)\}$ where $\chi(x) = 1$ if $x(t) = n$
and $\chi(x) = 0$ otherwise.)
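As a rough numerical illustration of the difference system (1.2), the sketch below solves it on a truncated range $|n| \le N$ with zero boundary values, using a standard tridiagonal elimination. The truncation, the function name `solve_difference_system` and the chosen parameter values are assumptions made for the example, not part of the paper. For $V \equiv 0$ the solution should agree with the Laplace transform of $e^{-at} I_n(at)$, namely $A^{|n|}/c$ with $c = (s^2 + 2as)^{1/2}$ and $A = a/(s + a + c)$.

```python
import math

def solve_difference_system(V, a, s, u, N):
    """Solve Psi_{n+1} - (2/a)(s + a + u V(n)) Psi_n + Psi_{n-1} = -(2/a) delta_{n,0}
    for |n| <= N, with Psi set to 0 outside that range (a truncation assumption)."""
    ns = list(range(-N, N + 1))
    diag = [-(2.0 / a) * (s + a + u * V(n)) for n in ns]
    rhs = [-(2.0 / a) if n == 0 else 0.0 for n in ns]
    size = len(ns)
    # Thomas algorithm; both off-diagonals are identically 1
    cp = [0.0] * size
    dp = [0.0] * size
    cp[0] = 1.0 / diag[0]
    dp[0] = rhs[0] / diag[0]
    for i in range(1, size):
        denom = diag[i] - cp[i - 1]
        cp[i] = 1.0 / denom
        dp[i] = (rhs[i] - dp[i - 1]) / denom
    psi = [0.0] * size
    psi[-1] = dp[-1]
    for i in range(size - 2, -1, -1):
        psi[i] = dp[i] - cp[i] * psi[i + 1]
    return dict(zip(ns, psi))

def free_solution(n, a, s):
    # A^{|n|} / c, the transform of exp(-a t) I_n(a t)
    c = math.sqrt(s * s + 2 * a * s)
    A = a / (s + a + c)
    return A ** abs(n) / c
```

Because $|A| < 1$, the error introduced by the truncation decays geometrically in N.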

In Section 2 we outline the proof of the theorem and in Section 3 we give some 
illustrative examples. 


2. Proof of Theorem. In order that we may easily interchange the order of 
certain limits, we assume first that V is bounded. This restriction will be removed 
later in the proof. Following the method and notation of Kac we define in- 
ductively 
$$Q_k(n, t) = \int_0^t \sum_{m=-\infty}^{\infty} V(m)\, e^{-a(t-\tau)} I_{n-m}[a(t - \tau)]\, Q_{k-1}(m, \tau)\, d\tau, \tag{2.1}$$

where $Q_0(n, t) = e^{-at} I_n(at)$. This gives


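$$\begin{aligned}
Q_k(n,t) &= \int_0^t \int_0^{\tau_k} \cdots \int_0^{\tau_2} \sum_{m_k} \cdots \sum_{m_1} V(m_k) \cdots V(m_1) \\
&\quad \cdot \exp[-a(t-\tau_k) - a(\tau_k - \tau_{k-1}) - \cdots - a(\tau_2 - \tau_1) - a\tau_1] \\
&\quad \cdot I_{n-m_k}(a(t-\tau_k))\, I_{m_k - m_{k-1}}(a(\tau_k - \tau_{k-1})) \cdots I_{m_2-m_1}(a(\tau_2-\tau_1))\, I_{m_1}(a\tau_1)\, d\tau_1 \cdots d\tau_k \\
&= \int_0^t \int_0^{\tau_k}\cdots\int_0^{\tau_2} E\{V[x(\tau_1)]\, V[x(\tau_2)] \cdots V[x(\tau_k)],\ x(t) = n\}\, d\tau_1 \cdots d\tau_k \\
&= E\Big\{\int_0^t\int_0^{\tau_k}\cdots\int_0^{\tau_2} V[x(\tau_1)]\, V[x(\tau_2)]\cdots V[x(\tau_k)]\, d\tau_1\cdots d\tau_k,\ x(t) = n\Big\}.
\end{aligned}$$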


Thus 

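$$Q_k(n,t) = E\Big\{\frac{1}{k!}\Big(\int_0^t V[x(\tau)]\, d\tau\Big)^k,\ x(t) = n\Big\} \le \frac{1}{k!}\, t^k M^k P\{x(t) = n\}, \tag{2.2}$$

where M is an upper bound for V. We define

$$Q(n, t, u) = \sum_{k=0}^{\infty} (-1)^k u^k Q_k(n, t). \tag{2.3}$$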


Using (2.2), we obtain 


$$Q(n,t,u) = E\Big\{\exp\Big[-u\int_0^t V(x(\tau))\,d\tau\Big],\ x(t) = n\Big\}. \tag{2.4}$$

We see immediately that

$$Q(n, t, u) \le P\{x(t) = n\} = e^{-at} I_n(at). \tag{2.5}$$

From (2.1) and (2.3), we find

$$Q(n,t,u) - Q_0(n,t) = -u \sum_{m=-\infty}^{\infty} \int_0^t V(m)\, e^{-a(t-\tau)} I_{n-m}[a(t-\tau)]\, Q(m,\tau,u)\, d\tau. \tag{2.6}$$

Now let [see (1.1)] $\Psi_n = \int_0^\infty e^{-st} Q(n, t, u)\, dt$, and take the Laplace transform
of both sides in (2.6). This gives (see [4], page 131, Formula 6)

$$\Psi_n = \frac{A^{|n|}}{c} - \frac{u}{c} \sum_{m=-\infty}^{\infty} V_m A^{|n-m|}\, \Psi_m, \tag{2.7}$$

where $c = (s^2 + 2as)^{1/2}$ and $A = a/(s + a + c)$. From (2.7) it can be shown
that, for $n \ne 0$,

$$\Psi_{n+1} + \Psi_{n-1} = [(A + A^{-1}) + (u/c)V_n(A^{-1} - A)]\Psi_n,$$
with a similar formula for n = 0. The difference system (1.2) now follows easily,
the boundary conditions coming from the estimate in (2.5). 

Now suppose that V(x) is unbounded and define $V_M(x)$ as V(x) if $V(x) \le M$
and 0 otherwise. We have then from (1.2) the difference equation

$$\Psi_{M,n+1} - \frac{2}{a}(s + a + uV_{M,n})\Psi_{M,n} + \Psi_{M,n-1} = -\frac{2}{a}\delta_{n,0}, \tag{2.8}$$

where

$$\Psi_{M,n} = \int_0^\infty e^{-st} E\Big\{\exp\Big[-u\int_0^t V_M(x(\tau))\,d\tau\Big],\ x(t) = n\Big\}\,dt.$$

By bounded convergence $\lim_{M\to\infty} \Psi_{M,n} = \Psi_n$. Thus, taking limits on both sides
of (2.8), we obtain the desired result.


3. Examples. 
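(a) Let V(x) = 0 if $-p < x < q$ and 1 otherwise, where p and q are positive
integers. We define $\Psi_n^* = \lim_{u\to\infty} \Psi_n$ and note that

$$\Psi_n^* = \int_0^\infty e^{-st} P\{-p < x(\tau) < q \ \text{for}\ 0 \le \tau \le t,\ x(t) = n\}\, dt.$$

We observe that, for $-p < n < q$, $\Psi_n^*$ satisfies the difference equation in (1.2)
corresponding to this V; hence,

$$\Psi_n^* = \begin{cases} 0, & n \le -p, \\ D_1 A^n + D_2 A^{-n}, & -p \le n \le 0, \\ E_1 A^n + E_2 A^{-n}, & 0 \le n \le q, \\ 0, & n \ge q, \end{cases} \tag{3.1}$$

where $D_1$, $D_2$, $E_1$, and $E_2$ are suitable constants, and

$$A = a/[s + a + (s^2 + 2as)^{1/2}].$$

Let

$$\Psi^* = \sum_{n=-\infty}^{\infty} \Psi_n^* = \int_0^\infty e^{-st} P\{-p < x(\tau) < q\ \text{for}\ 0\le\tau\le t\}\,dt.$$

Using (3.1), we obtain

$$\Psi^* = \frac{1}{s}\cdot\frac{(1 - A^p)(1 - A^q)}{1 + A^{p+q}}. \tag{3.2}$$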
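The closed form for the two-sided problem can be checked numerically. The sketch below is an illustration with hypothetical names: it solves the difference equation of (1.2) (with V = 0) on the interior points $-p < n < q$, with $\Psi^*$ set to zero at $n = -p$ and $n = q$, sums the solution, and compares the sum with $(1/s)(1 - A^p)(1 - A^q)/(1 + A^{p+q})$. The plus sign in the denominator is the form that satisfies the difference system; for p = q = 1 it reduces to $1/(s + a)$, the transform of the probability $e^{-at}$ of no jump up to time t.

```python
import math

def interior_sum(p, q, a, s):
    # Unknowns Psi*_n for n = -p+1, ..., q-1, with Psi*_{-p} = Psi*_q = 0
    ns = list(range(-p + 1, q))
    diag = [-(2.0 / a) * (s + a)] * len(ns)
    rhs = [-(2.0 / a) if n == 0 else 0.0 for n in ns]
    size = len(ns)
    # Thomas algorithm with unit off-diagonals
    cp = [0.0] * size
    dp = [0.0] * size
    cp[0] = 1.0 / diag[0]
    dp[0] = rhs[0] / diag[0]
    for i in range(1, size):
        denom = diag[i] - cp[i - 1]
        cp[i] = 1.0 / denom
        dp[i] = (rhs[i] - dp[i - 1]) / denom
    psi = [0.0] * size
    psi[-1] = dp[-1]
    for i in range(size - 2, -1, -1):
        psi[i] = dp[i] - cp[i] * psi[i + 1]
    return sum(psi)

def closed_form(p, q, a, s):
    c = math.sqrt(s * s + 2 * a * s)
    A = a / (s + a + c)
    return (1 - A ** p) * (1 - A ** q) / (s * (1 + A ** (p + q)))
```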


In the special case where $p = \infty$, (3.2) is easily inverted giving

$$P\Big\{\sup_{0 \le \tau \le t} x(\tau) < q\Big\} = 1 - q\int_0^t e^{-a\tau}\, \frac{I_q(a\tau)}{\tau}\, d\tau,$$

a result obtained by Baxter and Donsker in ([5], Section 4).





(b) Let $V(x) = x^2$. The difference equation in (1.2) then becomes

$$\Psi_{n+1} - \frac{2}{a}(s + a + un^2)\Psi_n + \Psi_{n-1} = -\frac{2}{a}\delta_{n,0}.$$

We define $\Psi(\xi) = \sum_{n=-\infty}^{\infty} e^{2in\xi}\Psi_n = \sum_{n=-\infty}^{\infty} \Psi_n \cos 2n\xi$. Then $\Psi(\xi)$ satisfies the
differential system

$$\Psi''(\xi) - [(4/u)(s + a) - (4a/u)\cos 2\xi]\Psi(\xi) = -4/u, \qquad \Psi'(0) = \Psi'(\pi/2) = 0. \tag{3.3}$$

To solve (3.3) we consider the differential equation

$$\Psi''(\xi) + [\mu - (4/u)(s + a) + (4a/u)\cos 2\xi]\Psi(\xi) = 0 \tag{3.4}$$


with the same boundary conditions as in (3.3). The Green's function $G(\xi, \eta)$
for (3.4) is given by

$$G(\xi,\eta) = \sum_{k=0}^{\infty} \phi_k(\xi)\phi_k(\eta)/\mu_k, \tag{3.5}$$

where $\mu_k$ and $\phi_k(\xi)$ are the eigenvalues and normalized eigenfunctions of (3.4)
respectively. By Mercer's Theorem ([6], p. 138) the convergence is uniform in
$\xi$ and $\eta$, the $\mu_k$'s all being positive (at least for large s). The solution for $\Psi(\xi)$
in (3.3) is thus given by

$$\Psi(\xi) = \frac{4}{u}\int_0^{\pi/2} G(\xi,\eta)\,d\eta = \frac{4}{u}\sum_{k=0}^{\infty}\frac{\phi_k(\xi)}{\mu_k}\int_0^{\pi/2}\phi_k(\eta)\,d\eta. \tag{3.6}$$

On the other hand, if we let $\lambda = \mu - (4/u)(s + a)$, (3.4) is seen to be Mathieu's
equation. Using the notation of ([4], p. 46), we find that


oy (&) aes by Cem () = bh _ Ax 2n cos 2né 
0 


where hy = (2/x)' ifk = Oand h = 2, (x) ifk = 0. 
Upon substituting in (3.6), we obtain 


W(t) = 2x/u > b? Ax o/ us > Axt 2n COS 2né 
k=O n=) 


ax +a 
= (4/u) >, Asso/ ss x Ax an COS 2nk 
k=O 


a=——et 
where Axo, = Ax,» for n < 0. After interchanging the order of summation, 


we have 


v(t) = (4/u) > > } Aco Aron [he + (4/u)(s + a)}} cos 2nét. 


n=—w k=O 
By the uniqueness of the Fourier coefficients, it follows that


= (4/u) du Axo Arton/[\e + (4/u)(8 + a)). 
=0 


Inverting with respect to s, we obtain 


B{exp| -u [ ie(s)P dr], 2(0) i nb= 2, AnoAn, an e eter 
ERGODICITY OF QUEUES IN SERIES


By J. Sacks 
Columbia University 

1. Introduction. We are interested here in determining when a queueing system 
consisting of several queues in series is ergodic. To define what is meant by queues 
in series let us consider the case where there are two servers. The definition of 
two queues in series is given as follows: The nth individual arriving to the queue- 
ing system enters, at his time of arrival, a queue (queue 1) in front of the first 
server. He waits there until all the individuals in front of him have been served 
in the first server at which time he begins his service. Upon completion of his 
service the nth individual enters a queue (queue 2) in front of the second server, 
waits there until all the individuals in front of him have completed their service 
in the second server, and at that time he begins his own service. Queue 1 and 
queue 2 are now said to be in series. Putting matters more concisely we can say 
that two queues are in series if the output of the first queue is the input of the 
second queue. 

To define what we mean by the ergodicity of this queueing system, let $W_n$
be the waiting time in queue 1 of the nth individual and let $W_n^*$ denote his waiting
time in queue 2. The queueing system is said to be ergodic if the joint distribution
of $(W_n, W_n^*)$ converges, as $n \to \infty$, to a probability distribution.
Assuming existence of first moments for the two service time random variables
and the interarrival random variable (the sequence of interarrival time random
variables is assumed to be a sequence of independent and identically distributed
random variables and the same is assumed for each of the two sequences of
service time random variables) we are able, in Theorems 1 and 2 below, to characterize
when $\{(W_n, W_n^*)\}$ has a limiting probability distribution.

The method we use to characterize ergodicity is first to show (Lemma 2 below)
that the distribution function of $(W_n, W_n^*)$ converges to a limit as $n \to \infty$,
though the limit may not be a probability distribution function. This is the easy
part of the argument. The second part of the argument is to show under appropriate
conditions (see Theorem 1) that $(W_n, W_n^*)$ is bounded in probability,
so that as $n \to \infty$ no probability escapes to infinity; this yields the fact that
the limit shown to exist in the first part is a bona fide probability distribution.
The last part of the characterization lies in showing that when the conditions
for Theorem 1 are not satisfied then either $W_n$ or $W_n^*$ goes to $+\infty$ in probability
(Theorem 2). This outline of the argument is quite the same as the outline of
the argument used by Kiefer and Wolfowitz [3] for the queueing system they
consider. The details of the first part are strongly related to those in [3]. The
means whereby the second and third parts of the argument are accomplished
depend on knowledge of the behavior of the maximum of partial sums of independent
and identically distributed random variables. This was first utilized by
Lindley [4] in his treatment of the one server queueing system.

Burke [2], Reich [5] and others (see [2] and [5] for other references) have con- 
sidered queues in series when the service time random variables and interarrival 
random variables are exponentially distributed. Akaike [1] considers a problem 
of ergodic behavior of queues in series related to the one we treat here. Akaike 
assumes that all the random variables in sight take on values which are integral 
multiples of some fixed positive number so that the waiting time process is a 
discrete process (our random variables have no such restriction). Furthermore 
he assumes that the nth customer cannot enter the second queue before customer 
n + 1 arrives at the first queue; thus if customer n finishes service at server 1
before n + 1 arrives, customer n must wait until customer n + 1 arrives before 
entering the second queue. Our assumptions are that the nth customer enters the 
second queue immediately after finishing service at server 1. 

We have only talked about the case of two servers which gives rise to two 
queues in series. It is simple to see how to define s queues in series when there 
are s servers—this giving rise to s different waiting times to worry about. All our 
previous remarks for the two-server case are valid for the s-server case. We have 
separated the treatment of the two-server case from that of the s-server case in 
order to avoid confusing notational problems with the principal ideas. 


2. The Two-Server Case. In this section we will consider the case where there 
are two queues in series, the output of the first queue being the input to the 
second queue. 

Let $\tau_n$ be the time at which the nth individual enters the system. Let $R_n$ denote
the service time in the first server of individual n and let $\rho_n$ be the service
time of individual n in the second server. Let $g_{n+1} = \tau_{n+1} - \tau_n$. We assume
that each of the three sequences $\{R_n; n \ge 1\}$, $\{g_n; n \ge 2\}$, $\{\rho_n; n \ge 1\}$ is a
sequence of independent and identically distributed random variables and that
the three sequences are mutually independent. Furthermore we assume that the
$R_n$'s, $g_n$'s, and $\rho_n$'s are non-negative random variables and that $ER_1 < \infty$,
$Eg_2 < \infty$, $E\rho_1 < \infty$.

Let $W_n$ be the waiting time in the first queue of the nth person and let $W_n^*$
be the waiting time in the second queue of the nth person. The waiting time,
of course, is the time between arrival at the queue and the beginning of service.
To establish a relationship between $(W_n, W_n^*)$ and $(W_{n+1}, W_{n+1}^*)$ observe that
the nth individual leaves the first server (enters the second queue) at time
$\tau_n + W_n + R_n$ and leaves the second server at time $\tau_n + W_n + R_n + W_n^* + \rho_n$,
while the (n + 1)th individual arrives at the first queue at time $\tau_{n+1}$ and at
the second queue at time $\tau_{n+1} + W_{n+1} + R_{n+1}$. Thus the (n + 1)th person
waits 0 time in the first queue if $\tau_n + W_n + R_n \le \tau_{n+1}$, i.e., if $W_n + R_n - g_{n+1} \le 0$,
and waits $W_n + R_n - g_{n+1}$ if the last quantity is positive. Stated
more concisely

$$W_{n+1} = \max[0,\ W_n + R_n - g_{n+1}]. \tag{2.1}$$

Similarly,

$$W_{n+1}^* = \max[0,\ W_n^* + \rho_n - R_{n+1} + R_n - g_{n+1} + W_n - W_{n+1}]. \tag{2.2}$$

(2.1) and (2.2) are valid for all $n \ge 1$ with $W_1 = W_1^* = 0$.
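The recursions (2.1) and (2.2) can be compared with a direct bookkeeping of arrival and departure times. The following sketch is an illustration only: the function names and the 0-based indexing are choices made here, and integer-valued times are used in the check so that both computations can be compared exactly.

```python
def waits_by_recursion(R, rho, g):
    """R[k], rho[k]: service times of customer k (0-based) at servers 1 and 2.
    g[k]: time between the arrivals of customers k-1 and k (g[0] is unused)."""
    n = len(R)
    W = [0] * n       # waiting times in queue 1
    Ws = [0] * n      # waiting times in queue 2
    for k in range(n - 1):
        W[k + 1] = max(0, W[k] + R[k] - g[k + 1])                      # (2.1)
        Ws[k + 1] = max(0, Ws[k] + rho[k] - R[k + 1] + R[k] - g[k + 1]
                        + W[k] - W[k + 1])                             # (2.2)
    return W, Ws

def waits_by_event_times(R, rho, g):
    """Same quantities, obtained by tracking when each server becomes free."""
    W, Ws = [], []
    arrival = 0
    free1 = free2 = 0
    for k in range(len(R)):
        if k > 0:
            arrival += g[k]
        start1 = max(arrival, free1)
        free1 = start1 + R[k]        # departure from server 1 = arrival at queue 2
        start2 = max(free1, free2)
        W.append(start1 - arrival)
        Ws.append(start2 - free1)
        free2 = start2 + rho[k]
    return W, Ws
```

The second function uses only the verbal definition of queues in series (the output of the first queue is the input of the second), so agreement of the two is a direct check of (2.2).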

Let $Z_n = (W_n, W_n^*)$. $\{Z_n\}$ is not a Markov process but putting $Y_n = (Z_n, R_n)$
provides us with a sequence $\{Y_n\}$ which is a Markov process with stationary
transition probabilities. These considerations will enable us to prove Lemma 2
below which is the first step in characterizing when $\{Z_n\}$ is an ergodic process.

Let $t = (t_1, t_2)$, $x = (x_1, x_2)$ with $t_1, t_2, x_1, x_2$ all nonnegative numbers.

LEMMA 1: $P\{Z_n \le t \mid Z_1 = x, R_1 = r\} \le P\{Z_n \le t \mid Z_1 = 0, R_1 = r\}$ for all
t, x, r.

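PROOF: Fix a point $\omega$ in the sample space of $R_1, \ldots, R_n, g_2, \ldots, g_n, \rho_1, \ldots, \rho_{n-1}$,
and let

$$W_1(\omega, x) = x_1, \qquad W_1^*(\omega, x) = x_2,$$

$$W_j(\omega, x) = \max[0,\ W_{j-1}(\omega, x) + R_{j-1}(\omega) - g_j(\omega)],$$

$$W_j^*(\omega, x) = \max[0,\ W_{j-1}^*(\omega, x) + \rho_{j-1}(\omega) - R_j(\omega) + R_{j-1}(\omega) - g_j(\omega) + W_{j-1}(\omega, x) - W_j(\omega, x)],$$

for $2 \le j \le n$. It is clear that $W_j(\omega, 0) \le W_j(\omega, x)$ for each j. Observing that

$$\begin{aligned}
R_{j-1}(\omega) + W_{j-1}(\omega, x) - W_j(\omega, x)
&= R_{j-1}(\omega) + W_{j-1}(\omega, x) - \max[0,\ W_{j-1}(\omega, x) + R_{j-1}(\omega) - g_j(\omega)] \\
&= \min[W_{j-1}(\omega, x) + R_{j-1}(\omega),\ g_j(\omega)],
\end{aligned}$$

we have

$$W_j^*(\omega, x) = \max[0,\ W_{j-1}^*(\omega, x) + \rho_{j-1}(\omega) - R_j(\omega) - g_j(\omega) + \min[W_{j-1}(\omega, x) + R_{j-1}(\omega),\ g_j(\omega)]]$$

and it follows easily that $W_j^*(\omega, 0) \le W_j^*(\omega, x)$ for all $2 \le j \le n$. Lemma 1
is now seen to be true.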

LEMMA 2: $P\{Z_n \le t \mid Z_1 = 0\} \to F(t)$ as $n \to \infty$ where F is a two-dimensional
distribution function whose variation over two-dimensional space may be less than
one, i.e., F may not be a probability distribution function.

PROOF: Let $H(z, r) = P\{Z_2 \le z, R_2 \le r \mid Z_1 = 0\}$. Then

$$P\{Z_{n+1} \le t \mid Z_1 = 0\} = \int P\{Z_{n+1} \le t \mid Z_2 = z, R_2 = r, Z_1 = 0\}\, dH(z, r). \tag{2.3}$$

Since $\{Y_n\}$ ($Y_n = (Z_n, R_n)$) is a stationary Markov process and because of
Lemma 1

$$P\{Z_{n+1} \le t \mid Z_2 = z, R_2 = r, Z_1 = 0\} = P\{Z_n \le t \mid Z_1 = z, R_1 = r\} \le P\{Z_n \le t \mid Z_1 = 0, R_1 = r\}. \tag{2.4}$$

Let $H^*$ be the distribution function of $R_2$ and, therefore, of $R_1$. Then, using
(2.4) in (2.3), we have

$$\begin{aligned}
P\{Z_{n+1} \le t \mid Z_1 = 0\} &\le \int P\{Z_n \le t \mid Z_1 = 0, R_1 = r\}\, dH(z, r) \\
&= \int P\{Z_n \le t \mid Z_1 = 0, R_1 = r\}\, dH^*(r) = P\{Z_n \le t \mid Z_1 = 0\}.
\end{aligned}$$

Thus $P\{Z_n \le t \mid Z_1 = 0\}$ is a monotone sequence and therefore converges to a
limit which we call F(t). The above-mentioned properties of F are easily deduced.

THEOREM 1: If $Eg_2 > \max(ER_1, E\rho_1)$ then F (defined in Lemma 2) is a
probability distribution.
PROOF: Because of Lemma 2 we need only show that, under the conditions
stated here, $\{Z_n\}$ is bounded in probability, i.e., for all n

$$P\{Z_n \le t \mid Z_1 = 0\} \ge 1 - \eta(t) \tag{2.5}$$

where $\eta(t) \to 0$ as $t \to \infty$. If we can prove that $\{W_n\}$ and $\{W_n^*\}$ are each bounded
in probability then (2.5) will be established. Lindley [4] has shown that $\{W_n\}$ is
bounded in probability so it remains only to consider $\{W_n^*\}$.
For j = 1, 2, ... let

$$S_j = \sum_{i=1}^{j} (R_i - g_{i+1}) \tag{2.6}$$

and let $S_0 = 0$. By iterating (2.1) and using the fact that $W_1 = 0$ we have

$$W_{n+1} = \max_{0 \le j \le n} [S_n - S_j]. \tag{2.7}$$

Hence

$$R_n - g_{n+1} + W_n = R_n - g_{n+1} + \max_{0 \le j \le n-1} [S_{n-1} - S_j] = \max_{0 \le j \le n-1} [S_n - S_j]. \tag{2.8}$$

For $k \ge 0$, let

$$B_k = \max_{0 \le j \le k} (-S_j). \tag{2.9}$$

Then (2.7), (2.8), and (2.9) yield

$$R_n - g_{n+1} + W_n - W_{n+1} = B_{n-1} - B_n. \tag{2.10}$$

Using (2.10) in (2.2) gives

$$W_{n+1}^* = \max[0,\ W_n^* + \rho_n - R_{n+1} + B_{n-1} - B_n]. \tag{2.11}$$
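Put

$$T_k = \sum_{i=1}^{k} (\rho_i - R_{i+1}) \ \text{ for } k \ge 1 \quad \text{and} \quad T_0 = 0. \tag{2.12}$$

Iterating (2.11) (use $W_1^* = 0$) we have

$$\begin{aligned}
W_{n+1}^* &= \max_{0 \le k \le n} [T_n - T_k + B_k - B_n] \\
&= \max_{0 \le k \le n} \Big[T_n - T_k + \max_{0 \le j \le k} (-S_j) - B_n\Big] \\
&= \max_{0 \le j \le k \le n} [T_n - T_k - S_j - B_n].
\end{aligned} \tag{2.13}$$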
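The closed forms (2.7) and (2.13) can be checked against the recursions (2.1) and (2.2) on sample data. In the sketch below (an illustration with hypothetical function names), the lists are indexed as in the text, so position 0 of each list is an unused placeholder; R needs entries 1 to n + 1, rho entries 1 to n, and g entries 2 to n + 1.

```python
def by_recursion(R, rho, g, n):
    """Return (W_{n+1}, W*_{n+1}) from (2.1) and (2.2), starting at W_1 = W*_1 = 0."""
    W = Ws = 0
    for k in range(1, n + 1):
        W_next = max(0, W + R[k] - g[k + 1])
        Ws = max(0, Ws + rho[k] - R[k + 1] + R[k] - g[k + 1] + W - W_next)
        W = W_next
    return W, Ws

def by_closed_form(R, rho, g, n):
    """Return (W_{n+1}, W*_{n+1}) from (2.7) and (2.13)."""
    S, T = [0], [0]
    for i in range(1, n + 1):
        S.append(S[-1] + R[i] - g[i + 1])      # S_i of (2.6)
        T.append(T[-1] + rho[i] - R[i + 1])    # T_i of (2.12)
    B, running = [], 0
    for j in range(n + 1):
        running = max(running, -S[j])          # B_j of (2.9); B_0 = 0
        B.append(running)
    W = max(S[n] - S[j] for j in range(n + 1))
    Ws = max(T[n] - T[k] + B[k] - B[n] for k in range(n + 1))
    return W, Ws
```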


Let $\epsilon \ge 0$ (we shall specify $\epsilon$ later). Then

$$\begin{aligned}
W_{n+1}^* &= \max_{0 \le j \le k \le n} [T_n - T_k - (n-k)\epsilon + (n-k)\epsilon - S_j - B_n] \\
&\le \max_{0 \le j \le k \le n} [T_n - T_k - (n-k)\epsilon] + \max_{0 \le j \le k \le n} [(n-k)\epsilon - S_j - B_n] \\
&= \max_{0 \le k \le n} [T_n - T_k - (n-k)\epsilon] + \max_{0 \le j \le n} [(n-j)\epsilon - S_j - B_n].
\end{aligned} \tag{2.14}$$

Define $\xi_i = \rho_i - R_{i+1} - \epsilon$ and let $U_k = \sum_{i=1}^{k} \xi_i$. Thus $U_k$ is the kth partial
sum of independent and identically distributed random variables. Then

$$\max_{0 \le k \le n} [T_n - T_k - (n-k)\epsilon] = \max_{0 \le k \le n} (U_n - U_k) = A_n \ \text{(say)} \tag{2.15}$$

has the same distribution as $\max_{0 \le k \le n} U_k$. If $\epsilon$ is such that

$$E\rho_1 - ER_1 - \epsilon < 0 \tag{2.16}$$

then, it is well known, $\max_{0 \le k \le n} U_k \to$ a finite random variable with probability
one (w.p.1), which implies that $\{A_n\}$ is bounded in probability.
Observe that $B_n = \max_{0 \le j \le n} (-S_j) \ge -S_n$. Hence

$$\max_{0 \le j \le n} [(n-j)\epsilon - S_j - B_n] \le \max_{0 \le j \le n} [S_n - S_j + (n-j)\epsilon] = C_n \ \text{(say)}. \tag{2.17}$$

Let $V_j = \sum_{i=1}^{j} (R_i - g_{i+1} + \epsilon)$. Then $V_j$ is the jth partial sum of independent
and identically distributed random variables. Thus, as before, $C_n$ has the same
distribution as $\max_{0 \le j \le n} V_j$ and, if $\epsilon$ is such that

$$ER_1 - Eg_2 + \epsilon < 0, \tag{2.18}$$

then $\{C_n\}$ is bounded in probability. Since $W_{n+1}^* \le A_n + C_n$ we have only to
verify that $\epsilon$ can be chosen to satisfy (2.16) and (2.18) in order to conclude
that $\{W_n^*\}$ is bounded in probability.

If $E\rho_1 < ER_1$ then the choice $\epsilon = 0$ gives (2.16) and the condition of the
Theorem guarantees (2.18). If $E\rho_1 \ge ER_1$ take

$$\epsilon = [(E\rho_1 + Eg_2)/2] - ER_1.$$

$\epsilon$ is clearly positive and (2.16) and (2.18) are satisfied because $Eg_2 > E\rho_1$.
This concludes the proof of Theorem 1.
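The choice of $\epsilon$ at the end of the proof is easy to check by direct arithmetic. The sketch below is illustrative only (the function name and argument names are hypothetical); it takes the three means as inputs and returns the $\epsilon$ described above, so that the two drift conditions (2.16) and (2.18) can be evaluated.

```python
def choose_epsilon(mean_R, mean_rho, mean_g):
    """Return epsilon >= 0 with  mean_rho - mean_R - eps < 0   (2.16)
    and                          mean_R - mean_g + eps < 0     (2.18),
    assuming the condition of Theorem 1: mean_g > max(mean_R, mean_rho)."""
    if not mean_g > max(mean_R, mean_rho):
        raise ValueError("condition of Theorem 1 is not satisfied")
    if mean_rho < mean_R:
        return 0.0
    return (mean_rho + mean_g) / 2 - mean_R
```

In the second case both drifts equal $(E\rho_1 - Eg_2)/2$, which is why the single hypothesis $Eg_2 > E\rho_1$ settles (2.16) and (2.18) at once.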


Theorem 2, which we now prove, shows the necessity of the condition of Theorem 1
when first moments are assumed to exist.

THEOREM 2: (a) If $ER_1 \ge Eg_2$ then $F(t) \equiv 0$.

(b) If $E\rho_1 \ge Eg_2$ then $F(t) \equiv 0$.

PROOF: (a) is due to Lindley [4] who proved that, in this case, $W_n \to +\infty$
in probability. For (b) we might as well assume in addition that $ER_1 < Eg_2$;
otherwise we can use (a).

If $ER_1 < Eg_2$ then

$$-S_n - B_n = -\max_{0 \le j \le n} [S_n - S_j] \tag{2.19}$$

is bounded in probability. From (2.13)

$$\begin{aligned}
W_{n+1}^* &= \max_{0 \le j \le k \le n} [T_n - T_k - S_j - B_n] \\
&= \max_{0 \le j \le k \le n} [T_n + S_n - T_k - S_j] - S_n - B_n.
\end{aligned} \tag{2.20}$$

Now, recalling that $R_i \ge 0$ w.p.1,

$$\begin{aligned}
\max_{0 \le j \le k \le n} [T_n + S_n - T_k - S_j] &\ge \max_{0 \le k \le n} [T_n + S_n - T_k - S_k] \\
&= \max_{0 \le k \le n} \Big[\sum_{i=k+1}^{n} (\rho_i - g_{i+1}) + R_{k+1} - R_{n+1}\Big] \\
&\ge \max_{0 \le k \le n} \Big[\sum_{i=k+1}^{n} (\rho_i - g_{i+1})\Big] - R_{n+1}.
\end{aligned} \tag{2.21}$$

$\{R_{n+1}\}$ is, of course, bounded in probability but, because $E\rho_1 - Eg_2 \ge 0$,

$$\max_{0 \le k \le n} \Big[\sum_{i=k+1}^{n} (\rho_i - g_{i+1})\Big] \to +\infty \ \text{ in probability}. \tag{2.22}$$

(2.22), (2.21) and (2.19) show that $W_{n+1}^* \to +\infty$ in probability, which proves
that $F(t) \equiv 0$. This proves Theorem 2.

It is interesting to note that if $ER_1 \ge Eg_2$ and $E\rho_1 < ER_1$ then, although
$W_n \to +\infty$ in probability, $W_n^*$ has a legitimate limiting distribution. This is
because we can show (as in Lemma 2) that $P\{W_n^* \le t \mid Z_1 = 0\}$ has a limit
and because of (2.13)

$$W_{n+1}^* \le \max_{0 \le k \le n} [T_n - T_k] + \max_{0 \le k \le n} [B_k - B_n] = \max_{0 \le k \le n} [T_n - T_k],$$

which is bounded in probability.

Just as in Kiefer and Wolfowitz [3] we can write down an integral equation
for the limiting distribution of $Y_n$. Under the conditions of Theorem 1 this
integral equation will have a unique probability distribution as a solution. The
uniqueness argument in [3] is rather delicate but in this problem the difficulty
is easily disposed of because of the ease in seeing (by means of (2.13) for example)
that the limiting distribution must be independent of the starting point
$(W_1, W_1^*)$.


3. The s-Server case. The question of ergodicity in the case of s servers can 
be handled in essentially the same fashion as in Section 2 where we had 2 servers. 
We shall be brief in those places where the generalization of the ideas in Section 
2 is transparent. 

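For $\sigma = 1, \ldots, s$ let $R_n^\sigma$ be the service time in server $\sigma$ of the nth person.
Let $R_{n+1}^0 = \tau_{n+1} - \tau_n$ where $\tau_n$ is the time at which the nth person arrives to
the first queue. Let $\xi_n^\sigma = R_n^\sigma - R_{n+1}^{\sigma-1}$, $\sigma = 1, \ldots, s$. Let $W_n^\sigma$ be the waiting
time in the $\sigma$th queue of the nth person. It is easily verified that, for all $1 \le p \le s$,

$$W_{n+1}^p = \max\Big[0,\ W_n^p + \xi_n^p + \sum_{\sigma=1}^{p-1} (\xi_n^\sigma + W_n^\sigma - W_{n+1}^\sigma)\Big]. \tag{3.1}$$

Let $T_k^\sigma = \sum_{i=1}^{k} \xi_i^\sigma$ and let $D_k^p = \max^* [-T_{j_1}^1 - \cdots - T_{j_p}^p]$ where max* is the maximum
over all $0 \le j_1 \le \cdots \le j_p \le k$. Let $H_n^p = \sum_{\sigma=1}^{p} (\xi_n^\sigma + W_n^\sigma - W_{n+1}^\sigma)$. To
obtain a manageable expression for $W_{n+1}^p$ we will show that $H_n^p = D_{n-1}^p - D_n^p$
for all p, n. Observe first that this is true when p = 1 and all n. Assume now
that $H_n^p = D_{n-1}^p - D_n^p$ for all n. We will show that $H_n^{p+1} = D_{n-1}^{p+1} - D_n^{p+1}$ for
all n.

From (3.1), the induction hypothesis, and iteration

$$\begin{aligned}
W_{k+1}^{p+1} &= \max[0,\ W_k^{p+1} + \xi_k^{p+1} + H_k^p] = \max[0,\ W_k^{p+1} + \xi_k^{p+1} + D_{k-1}^p - D_k^p] \\
&= \max_{0 \le j \le k} [T_k^{p+1} - T_j^{p+1} + D_j^p - D_k^p].
\end{aligned} \tag{3.2}$$

Hence, using (3.2) for k = n - 1 and k = n,

$$\begin{aligned}
\xi_n^{p+1} + W_n^{p+1} - W_{n+1}^{p+1} &= \xi_n^{p+1} + \max_{0 \le j \le n-1} [T_{n-1}^{p+1} - T_j^{p+1} + D_j^p - D_{n-1}^p] \\
&\quad - \max_{0 \le j \le n} [T_n^{p+1} - T_j^{p+1} + D_j^p - D_n^p] \\
&= \max_{0 \le j \le n-1} [-T_j^{p+1} + D_j^p] - \max_{0 \le j \le n} [-T_j^{p+1} + D_j^p] + D_n^p - D_{n-1}^p \\
&= D_{n-1}^{p+1} - D_n^{p+1} + D_n^p - D_{n-1}^p.
\end{aligned}$$

Thus

$$H_n^{p+1} = \xi_n^{p+1} + W_n^{p+1} - W_{n+1}^{p+1} + H_n^p = D_{n-1}^{p+1} - D_n^{p+1},$$

which is what we wanted to show. Since $H_n^p$ is what we say it is, (3.2) is valid
for all k and p (of course since there are only s servers we have no use for $W_n^p$
where p > s).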
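The identity $H_n^p = D_{n-1}^p - D_n^p$ can be tested on sample data. The sketch below is an illustration under stated assumptions: the name `check_identity` and the data layout are hypothetical, `R[sigma][n]` holds $R_n^\sigma$ for $\sigma = 0, \ldots, s$ and $n = 1, \ldots, N+1$ (index 0 unused), and $D_k^p$ is computed through the equivalent recursion $D_k^p = \max_{0 \le j \le k} [-T_j^p + D_j^{p-1}]$ with $D_k^0 = 0$.

```python
def check_identity(R, s, N):
    """Return True if H^p_n = D^p_{n-1} - D^p_n for all 1 <= p <= s, 1 <= n <= N."""
    # xi[sigma][n] = R^sigma_n - R^{sigma-1}_{n+1}
    xi = [[0] * (N + 1) for _ in range(s + 1)]
    for sig in range(1, s + 1):
        for n in range(1, N + 1):
            xi[sig][n] = R[sig][n] - R[sig - 1][n + 1]
    # Waiting times from (3.1), starting with W^p_1 = 0
    W = [[0] * (N + 2) for _ in range(s + 1)]
    for n in range(1, N + 1):
        carry = 0  # running sum over sigma < p of (xi + W_n - W_{n+1})
        for p in range(1, s + 1):
            W[p][n + 1] = max(0, W[p][n] + xi[p][n] + carry)
            carry += xi[p][n] + W[p][n] - W[p][n + 1]
    # Partial sums T^sigma_k and the nested maxima D^p_k
    T = [[0] * (N + 1) for _ in range(s + 1)]
    for sig in range(1, s + 1):
        for k in range(1, N + 1):
            T[sig][k] = T[sig][k - 1] + xi[sig][k]
    D = [[0] * (N + 1) for _ in range(s + 1)]
    for p in range(1, s + 1):
        best = None
        for k in range(N + 1):
            cand = -T[p][k] + D[p - 1][k]
            best = cand if best is None else max(best, cand)
            D[p][k] = best
    for p in range(1, s + 1):
        for n in range(1, N + 1):
            H = sum(xi[sig][n] + W[sig][n] - W[sig][n + 1] for sig in range(1, p + 1))
            if H != D[p][n - 1] - D[p][n]:
                return False
    return True
```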
Returning to (3.1) we remark that it is easy to verify just as in Section 2
that $Y_n = (W_n^1, \ldots, W_n^s, R_n^1, \ldots, R_n^{s-1})$ is the nth random variable in a
stationary Markov process and that

$$P\{W_n^1 \le a_1, \ldots, W_n^s \le a_s \mid W_1^\sigma = 0,\ \sigma = 1, \ldots, s\} \to F(a_1, \ldots, a_s) \tag{3.3}$$

where F is an s-dimensional distribution function but not necessarily a probability
distribution function.

Let $\mu_\sigma = ER_1^\sigma$, $\sigma = 0, 1, \ldots, s$.

THEOREM 3: If

$$\max_{1 \le \sigma \le s} \mu_\sigma < \mu_0 \tag{3.4}$$

then F is a probability distribution.

PROOF: As in Theorem 1 we only have to show that each $\{W_n^\sigma\}$ is bounded in
probability. It is easy to see by a trivial induction argument that we only have
to verify that $\{W_n^s\}$ is bounded in probability. Actually the argument we give
is legitimate when s is replaced by p for any $1 \le p \le s$. In any case we will
only consider $\{W_n^s\}$.

To begin with observe that

$$-D_n^{s-1} \le T_n^1 + \cdots + T_n^{s-1}. \tag{3.5}$$

Hence from (3.2) with k = n, p = s - 1

$$\begin{aligned}
W_{n+1}^s &\le \max_{0 \le j \le n} [T_n^1 + \cdots + T_n^s - T_j^s + D_j^{s-1}] \\
&= \max_{0 \le j_1 \le \cdots \le j_s \le n} [T_n^1 - T_{j_1}^1 + \cdots + T_n^s - T_{j_s}^s].
\end{aligned} \tag{3.6}$$

Let $s_0 = s$ and define $s_{i+1}$ to be the largest $\sigma < s_i$ ($i \ge 0$) with the property that

$$\mu_\sigma - \mu_{s_i} > 0. \tag{3.7}$$

$\sigma = 0$ satisfies (3.7) because of (3.4) so that $s_{i+1}$ is well-defined. Let k be the
first i such that $s_i = 0$. Then it is easy to check that

$$\mu_0 = \mu_{s_k} > \mu_{s_{k-1}} > \cdots > \mu_{s_0} = \mu_s \tag{3.8}$$

and that for $s_i < \sigma < s_{i-1}$

$$\mu_\sigma \le \mu_{s_{i-1}} < \mu_{s_i}. \tag{3.9}$$

Define, for i = 1, ..., k,

$$U_n^i = \max_{0 \le j_{s_i+1} \le \cdots \le j_{s_{i-1}} \le n} \Big[\sum_{\sigma = s_i + 1}^{s_{i-1}} (T_n^\sigma - T_{j_\sigma}^\sigma)\Big]. \tag{3.10}$$

Because of (3.6) we have

$$W_{n+1}^s \le U_n^1 + \cdots + U_n^k \tag{3.11}$$

and, therefore, in order to show that $\{W_n^s\}$ is bounded in probability, we have
only to verify that each $U_n^i$ (i = 1, ..., k) is bounded in probability.

The verification that each $U_n^i$ is bounded can be summarized in the following
lemma.

LEMMA: For m = 1, ..., M let $\{X_i^m, i = 1, 2, \ldots\}$ be a sequence of independent
and identically distributed random variables with

$$EX_1^m = \lambda_m - \lambda_{m-1} \tag{3.12}$$

where

$$\lambda_0 > \lambda_M \ge \max_{0 < \sigma < M} \lambda_\sigma \ge \min_{0 < \sigma < M} \lambda_\sigma \ge 0. \tag{3.13}$$

(It is not assumed that $\{X_i^m\}$ and $\{X_i^{m'}\}$ are independent of one another.) Let $S_n^m = \sum_{i=1}^{n} X_i^m$
and let $\psi_n = \max^* [\sum_{m=1}^{M} (S_n^m - S_{j_m}^m)]$ where max* is the maximum over
all $j_1, \ldots, j_M$ with $0 \le j_1 \le \cdots \le j_M \le n$. Then $\psi_n$ is bounded in probability.
It is easy to verify that $U_n^i$ can be taken as $\psi_n$ and that the conditions of the
lemma are satisfied ((3.9) giving (3.13)) so that the proof of the lemma is the
last step in proving Theorem 3.

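PROOF OF LEMMA: Let $\gamma = \min' [\lambda_i - \lambda_j]$ where min' is the minimum over all
$0 \le i, j \le M$ with $\lambda_i - \lambda_j > 0$. Let $\delta = \gamma/M$. $\delta$ is, of course, strictly positive.
For $2 \le m \le M$ define $\epsilon_m = \lambda_M - \lambda_{m-1} + (M - m + 1)\delta$ and let $\epsilon_{M+1} = 0$
and $\epsilon_1 = 0$. Then

$$\epsilon_2, \ldots, \epsilon_M \ \text{are positive}, \tag{3.14}$$

$$\lambda_m - \lambda_{m-1} + \epsilon_{m+1} - \epsilon_m = -\delta, \qquad 2 \le m \le M, \tag{3.15}$$

$$\lambda_1 - \lambda_0 + \epsilon_2 = \lambda_M - \lambda_0 + [(M - 1)/M]\gamma \le (-1/M)(\lambda_0 - \lambda_M) < 0. \tag{3.16}$$

Letting $j_{M+1} = n$ and taking note of the fact that $\epsilon_{M+1} = 0$ we have

$$\sum_{m=1}^{M} [(n - j_{m+1})\epsilon_{m+1} - (n - j_m)\epsilon_m] = 0 \tag{3.17}$$

where $0 \le j_1 \le j_2 \le \cdots \le j_M \le j_{M+1} = n$. Now, using (3.17),

$$\begin{aligned}
\psi_n &= \max^* \Big[\sum_{m=1}^{M} (S_n^m - S_{j_m}^m + (n - j_{m+1})\epsilon_{m+1} - (n - j_m)\epsilon_m)\Big] \\
&\le \sum_{m=1}^{M} \max^* [S_n^m - S_{j_m}^m + (n - j_{m+1})\epsilon_{m+1} - (n - j_m)\epsilon_m] \\
&\le \sum_{m=1}^{M} \max_{0 \le j_m \le n} [S_n^m - S_{j_m}^m + (n - j_m)(\epsilon_{m+1} - \epsilon_m)].
\end{aligned} \tag{3.18}$$

Each of the terms in the summation on the right hand side of (3.18) is bounded
in probability since

$$\max_{0 \le j_m \le n} [S_n^m - S_{j_m}^m + (n - j_m)(\epsilon_{m+1} - \epsilon_m)] = \max_{0 \le k \le n} [\Xi_n^m - \Xi_k^m]$$

where $\Xi_k^m = \sum_{i=1}^{k} (X_i^m + \epsilon_{m+1} - \epsilon_m)$ and $E(X_1^m + \epsilon_{m+1} - \epsilon_m) < 0$ due
to (3.15) and (3.16). This concludes the proof of the lemma and, therefore, the
theorem.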
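The construction of the constants $\epsilon_m$ in the proof of the lemma can be verified with exact rational arithmetic. The sketch below is an illustration only: the function name is hypothetical and the sample values of $\lambda_0, \ldots, \lambda_M$ are arbitrary numbers chosen to satisfy (3.13).

```python
from fractions import Fraction

def lemma_epsilons(lam):
    """lam = [lambda_0, ..., lambda_M] satisfying (3.13).
    Returns (delta, eps) where eps[m] is epsilon_m for m = 1, ..., M+1 (eps[0] unused)."""
    M = len(lam) - 1
    gaps = [lam[i] - lam[j] for i in range(M + 1) for j in range(M + 1)
            if lam[i] - lam[j] > 0]
    gamma = min(gaps)            # smallest positive difference
    delta = gamma / M
    eps = [None] * (M + 2)
    eps[1] = Fraction(0)
    eps[M + 1] = Fraction(0)
    for m in range(2, M + 1):
        eps[m] = lam[M] - lam[m - 1] + (M - m + 1) * delta
    return delta, eps

def drifts(lam, eps):
    """E(X^m_1 + eps_{m+1} - eps_m) for m = 1, ..., M, using E X^m_1 = lam[m] - lam[m-1]."""
    M = len(lam) - 1
    return [lam[m] - lam[m - 1] + eps[m + 1] - eps[m] for m in range(1, M + 1)]
```

With every drift strictly negative, each of the M maxima in (3.18) is the maximum of a random walk with negative mean, which is what makes it bounded in probability.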

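THEOREM 4: If $\max_{1 \le \sigma \le s} \mu_\sigma \ge \mu_0$, F is identically 0.

PROOF: Letting p be the first $\sigma \ge 1$ with $\mu_\sigma \ge \mu_0$ we need only show that
$W_n^p \to +\infty$ in probability. Using (3.2)

$$\begin{aligned}
W_{n+1}^p &= \max_{0 \le j \le n} [T_n^p - T_j^p + D_j^{p-1} - D_n^{p-1}] \\
&= \max_{0 \le j_1 \le \cdots \le j_p \le n} [T_n^p - T_{j_1}^1 - \cdots - T_{j_p}^p] - D_n^{p-1} \\
&= \max_{0 \le j_1 \le \cdots \le j_p \le n} [T_n^1 + \cdots + T_n^p - T_{j_1}^1 - \cdots - T_{j_p}^p] \\
&\quad - \max_{0 \le j_1 \le \cdots \le j_{p-1} \le n} [T_n^1 + \cdots + T_n^{p-1} - T_{j_1}^1 - \cdots - T_{j_{p-1}}^{p-1}].
\end{aligned} \tag{3.19}$$

The last term on the right hand side of (3.19) is bounded in probability because
$\max_{1 \le \sigma \le p-1} \mu_\sigma < \mu_0$. Looking at the first term on the right hand side of (3.19)
we have

$$\begin{aligned}
\max_{0 \le j_1 \le \cdots \le j_p \le n} [T_n^1 + \cdots + T_n^p - T_{j_1}^1 - \cdots - T_{j_p}^p]
&\ge \max_{0 \le k \le n} \Big[\sum_{j=k+1}^{n} \sum_{\sigma=1}^{p} (R_j^\sigma - R_{j+1}^{\sigma-1})\Big] \\
&= \max_{0 \le k \le n} \Big[\sum_{j=k+1}^{n} (R_j^p - R_{j+1}^0) + \sum_{\sigma=1}^{p-1} (R_{k+1}^\sigma - R_{n+1}^\sigma)\Big] \\
&\ge \max_{0 \le k \le n} \Big[\sum_{j=k+1}^{n} (R_j^p - R_{j+1}^0)\Big] - \sum_{\sigma=1}^{p-1} R_{n+1}^\sigma.
\end{aligned}$$

The last term written is bounded in probability while the preceding term goes
to $+\infty$ in probability because $\mu_p \ge \mu_0$. It is then quite clear that $W_{n+1}^p$ must
go to $+\infty$ in probability.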
QUEUES FOR A FIXED-CYCLE TRAFFIC LIGHT
By G. F. NEWELL
Brown University
1. Summary. In their book Studies in the Economics of Transportation, Beckmann,
McGuire and Winsten (BMW) ([2], pp. 11-13, 40-42) proposed a simple
queuing model for traffic flow through a fixed-cycle traffic light. Although they
derived a relation between the average delay per car and the average length of
the queue at the beginning of a red phase of the light, they only indicated some
possible numerical schemes for evaluating the latter. Here we shall derive analytic
expressions for the average queue length and consequently also the average delay
under equilibrium conditions for the BMW model.


2. Introduction. Several papers have been written on the subject of queuing 
at a fixed-cycle traffic light. Wardrop [7] and Webster [8] describe very extensive 
studies based upon experimental observation, computer simulation and semi- 
empirical theory with the theory based upon the assumption that the arrivals 
of cars at the light form a Poisson process. Uematu [6] investigated the queues 
for a model quite similar to that of BMW but was mainly concerned with the 
question of how long it takes an empty queue to reach some preassigned length 
for the first time. The present author also made a previous study [5] of delays 
but only considered arrival rates which were not too close to the critical value 
and used a more elaborate model than that considered here. 

In the model proposed by BMW, it is assumed that events such as the arrival 
or departure of a car at the traffic light may occur only on a set of discrete and 
equally spaced time points. The traffic light pattern is periodic in time with each 
cycle represented by a sequence of r consecutive time points designated as red 
points followed by a sequence of g points designated as green. At either a red or 
green point there is a probability a that one new car will arrive and a probability 
1 - a that no new cars arrive, these probabilities being independent of the
number of arrivals at any other time points. No cars are allowed to leave the 
light at red points but one car leaves at any green point provided that either a 
new car also arrives at that time or the queue just prior to this time point is 
non-empty. 

From these rules it follows that the lengths of the queue immediately before 
time points define a non-stationary Markov chain in which at any red point 
there is a probability a that the queue increases by one car and a probability 
1 — a that it remains unchanged, whereas at any green point there is a prob- 
ability a that a non-empty queue remains unchanged and a probability 1 — a 
that it decreases by one. The lengths of the queue before corresponding time 
points of successive cycles of the light, however, form a stationary Markov 
chain. If we let $q_\tau$ denote the length of queue just before the first red point of the
$\tau$th cycle, the $q_\tau$ satisfy the recursion relation,

$$q_{\tau+1} = \max\{q_\tau + u_\tau - g,\ 0\}, \tag{2.1}$$

in which $u_\tau$ represents the total number of arrivals during the $\tau$th cycle. The $u_\tau$
are independent random variables having a binomial distribution

$$\Pr\{u_\tau = m\} = \binom{r+g}{m} (1 - a)^{r+g-m} a^m. \tag{2.2}$$

Our problem here is to find the equilibrium distribution for $q_\tau$. Once this has
been found and $E(q_\tau)$ evaluated, the average waiting time per car measured in
units of the time interval between consecutive time points can be evaluated
from the formula derived by BMW,

$$w = r(1 - a)^{-1}(g + r)^{-1}[E(q)/a + (r + 1)/2]. \tag{2.3}$$

Relation (2.1) is equivalent to the recursion formula for a queue with bulk
service of g customers at a time. It has been studied previously by Bailey [1]
and Downton [3], [4] when arrivals have a Poisson distribution and service time
a $\chi^2$ distribution (a special case of which is service at constant time intervals).
Some of the analysis here for a binomial distribution of arrivals, particularly
Section 4, closely parallels the analysis described by Bailey.


3. Low Rates of Arrival. One method of determining the equilibrium distribution
of $q_\tau$ is to take any initial distribution, for example $q_0 = 0$ with probability
one, and evaluate the distribution for $q_1$, $q_2$, etc. from (2.1) and (2.2). If the
average rate of arrivals per cycle is less than the maximum rate of departure, i.e.,

$$a(r + g) < g, \tag{3.1}$$

then this sequence of distributions will always converge to the equilibrium distribution.
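The iteration described here is straightforward to carry out numerically. The sketch below is an illustration with hypothetical function names: it starts from a queue that is empty with probability one and applies the recursion $q \to \max\{q + u - g, 0\}$, with u binomial on r + g trials and success probability a, once per cycle.

```python
from math import comb

def arrivals_pmf(r, g, a):
    # Binomial distribution of the number of arrivals in one cycle of r + g time points
    n = r + g
    return [comb(n, m) * (1 - a) ** (n - m) * a ** m for m in range(n + 1)]

def next_cycle(dist, pmf, g):
    # One application of q -> max(q + u - g, 0) to the distribution of q
    r = len(pmf) - 1 - g
    new = [0.0] * (len(dist) + r)   # the queue can grow by at most r per cycle
    for q, pq in enumerate(dist):
        for m, pm in enumerate(pmf):
            new[max(q + m - g, 0)] += pq * pm
    return new

def queue_distribution(r, g, a, cycles):
    # Distribution of the queue length after the given number of cycles, from an empty queue
    pmf = arrivals_pmf(r, g, a)
    dist = [1.0]
    for _ in range(cycles):
        dist = next_cycle(dist, pmf, g)
    return dist

def mean_queue(dist):
    return sum(q * p for q, p in enumerate(dist))
```

For example, with r = g = 10 and a = 0.3 the mean queue length changes by less than $10^{-9}$ between 40 and 41 cycles, which illustrates the convergence when the arrival rate is below the departure capacity.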


If, in addition to (3.1), the difference between these rates is larger than the
dispersion of $u_\tau$, i.e.,

$$g - a(r + g) > [a(1 - a)(r + g)]^{1/2}, \tag{3.2}$$

then $\Pr\{q_1 > 0\}$ will be small compared with $\Pr\{q_1 = 0\}$, $\Pr\{q_2 > 0 \text{ and } q_1 > 0\}$
will be relatively much smaller yet, and the sequence of distributions for $q_1$, $q_2$,
etc. will converge rapidly to the equilibrium distribution, the more rapidly the
larger the difference in the two sides of (3.2).

If we take $\Pr\{q_0 = 0\} = 1$, the next approximation to the equilibrium distribution
is

$$\Pr\{q_1 = j\} = \binom{r+g}{g+j} (1 - a)^{r-j} a^{g+j}, \quad j > 0, \qquad \Pr\{q_1 = 0\} = 1 - \sum_{j=1}^{r} \Pr\{q_1 = j\}. \tag{3.3}$$
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The evaluation of the distributions for q; , q« , etc. is straightforward but becomes 
quite tedious. 

Since, in most practical applications, $r$ and $g$ are in the range of 10 to 20, we
expect that estimations of $E(q_\nu)$ in the limit $r$ and $g \to \infty$ with $r/g$ fixed will be
of some value. In this limit, (3.3) can be used to approximate $E(q_\nu)$ whenever

(3.4) $\mu = [g - \alpha(r+g)]\,[rg/(r+g)]^{-1/2} > 1$,

a condition which excludes only a range of $\alpha$ in which the difference between $\alpha$
and the critical value, $g/(r+g)$, is of order $r^{-1/2}$. For $r$ sufficiently large, this
excluded range can be made arbitrarily small but if $r = g = 10$, for example,
it is from $\alpha \approx 0.38$ to 0.5 and for $r = g = 20$ from $\alpha \approx 0.42$ to 0.5.
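The excluded ranges quoted above follow from setting the parameter of (3.4) equal to one. A minimal sketch, assuming the parameter is $[g - \alpha(r+g)][rg/(r+g)]^{-1/2}$ as in (3.4); the function names are illustrative only.

```python
from math import sqrt

def mu(alpha, r, g):
    """Parameter of (3.4): distance from the critical rate in units of the dispersion."""
    return (g - alpha * (r + g)) / sqrt(r * g / (r + g))

def alpha_where_mu_equals(mu_value, r, g):
    """Arrival probability per time point at which mu takes the given value."""
    return (g - mu_value * sqrt(r * g / (r + g))) / (r + g)

for r, g in [(10, 10), (20, 20)]:
    lower = alpha_where_mu_equals(1.0, r, g)   # start of the excluded range
    critical = g / (r + g)                     # end of the excluded range
    print(r, g, round(lower, 3), critical)
```

For $r = g = 10$ the lower end comes out near 0.39 and for $r = g = 20$ near 0.42, in line with the figures in the text.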

From (3.3) we obtain for0 <j«r 


Prig: = j} 


(3.5) +o — a)‘a’| ra_ |’ —J te) 
{etna ate} (AG + 0(0) 


and 


ae r+l a’™ 
(36) E(@) = (r +9 + 11 a)" 


: — [1 + O(u™)). 
girly? 


If we disregard the smaller values of a and assume that » < r'’*, then (3.6) can 
be simplified further by using Stirling’s formula and expansions of log a in powers 
of uw to give 


4 jut 
(3.7) E(q) = lee 3] exp(=1'72) fi a: (“’) or “] 


For «’ > 1, we can also estimate that E(q,) will differ from E(q) only by an 
additional term that is smaller than E(q) by a factor proportional to 
exp(—y'/2). 

Whereas in practical applications, the error terms in (3.6) or (3.7) may be
quite significant, these equations at least give an accurate description of what
happens for sufficiently large $r$ and $\mu$ and a qualitative description even for
moderately large $r$. In the range $\mu > 1$, $E(q_\nu)$ is a monotone increasing function
of $\alpha$ and is of order $r^{1/2}$ for $\mu = O(1)$. For $r \gg 1$, $\mu$ is a rapidly varying function
of $\alpha$ and as $\alpha$ decreases $E(q_\nu)$ also decreases very rapidly. Even for the largest
$\alpha$ at which we may apply these formulas, however, where $E(q_\nu)$ is of order $r^{1/2}$,
the effect of the queue on $w$ is small because in (2.3) $E(q_\nu)$ must be added to
another term that is of order $r$. For $r = g = 10$, $E(q_\nu)$ causes only about a 20%
increase in $w$ even when $\mu = 1$.


To investigate what happens for $\mu < 1$, we consider below a different method
of evaluating $E(q_\nu)$.

4. Use of Generating Functions. Let

(4.1) $G_\nu(z) = \sum_{j=0}^{\infty} z^{j} \Pr\{q_\nu = j\}$

denote the probability generating function (p.g.f.) for $q_\nu$. From (2.2) $u_\nu$ has
the p.g.f. $(1-\alpha+\alpha z)^{r+g}$ and, since $u_\nu$ and $q_\nu$ are independent, $u_\nu + q_\nu - g$ has
the p.g.f. $(1-\alpha+\alpha z)^{r+g} z^{-g} G_\nu(z)$. If we subtract from this the probabilities
for negative values of $u_\nu + q_\nu - g$ and reassign them to the event $q_{\nu+1} = 0$, we
obtain from (2.1) the p.g.f. for $q_{\nu+1}$,


(4.2) $G_{\nu+1}(z) = z^{-g}\Big[(1-\alpha+\alpha z)^{r+g} G_\nu(z) - \sum_{k=0}^{g-1} a_k z^{k}\Big] + \sum_{k=0}^{g-1} a_k$,

in which the $a_k$ are the Taylor expansion coefficients of $(1-\alpha+\alpha z)^{r+g} G_\nu(z)$.
If there is an equilibrium distribution for the queue length with $G_{\nu+1}(z) =
G_\nu(z) = G(z)$ then (4.2) gives

(4.3) $G(z) = Q(z)^{-1}\Big[\sum_{k=0}^{g-1} a_k z^{g} - \sum_{k=0}^{g-1} a_k z^{k}\Big]$

with

(4.4) $Q(z) = z^{g} - (1-\alpha+\alpha z)^{r+g}$.

We do not know the $a_k$ unless we know $G(z)$, but (4.3) and (4.4) at least
describe the form of $G(z)$, a polynomial of degree $g$ divided by another polynomial
of degree $r + g$. We also know that, if $G(z)$ is a p.g.f., it must be analytic
in the unit circle $|z| < 1$ of the complex plane, and in particular at any points
in this circle where $Q(z)$ has a zero.

Since $Q(z)$ is analytic, the number of zeros of $Q(z)$ inside or on the circle
$|z| = 1$ is equal to $g$ plus the number of cycles through which the complex phase
of $z^{-g}Q(z)$ changes when $z$ traverses a path just outside the unit circle, or equivalently
$g$ plus the number of times the image of this path under the transformation
$z^{-g}Q(z)$ encircles the origin. Since for $|z| = 1$ and $0 < \alpha < 1$

$|z^{-g}Q(z) - 1| = |1-\alpha+\alpha z|^{r+g} \le 1$,

with the last equality sign valid only at $z = 1$, the image of the unit circle
itself passes through the origin once as $z$ passes through $z = 1$ but otherwise lies
to the right of the origin. Whether or not $z^{-g}Q(z)$ encircles the origin as $z$ traverses
a path just outside the unit circle is, therefore, determined by what happens to
$z^{-g}Q(z)$ for $z$ in the neighborhood of $z = 1$. By expanding $z^{-g}Q(z)$ in a Taylor
series about $z = 1$, one can easily show that as $z$ passes to the right of $z = 1$,
$z^{-g}Q(z)$ passes to the right of the origin if $\alpha(r+g) < g$ and so fails to encircle
the origin, but passes to the left of the origin, thereby encircling it once,
if $\alpha(r+g) \ge g$. We conclude from this that $Q(z)$ has $g$ zeros inside or on the
unit circle if $\alpha(r+g) < g$ but $g + 1$ zeros if $\alpha(r+g) \ge g$. Since $\alpha(r+g) < g$
is also the condition for existence of an equilibrium distribution of $q_\nu$, only this
case is of interest here.

If $G(z)$ is to be analytic for $|z| \le 1$, each of the $g$ factors $(z - z_l)$ of $Q(z)$
with $|z_l| \le 1$ must cancel a corresponding factor of the $g$th degree polynomial
in the numerator of $G(z)$ and $G(z)$ will reduce to the form

$G(z) = A \prod_{l=1}^{r} (z - z_l)^{-1}$,

in which $z_l$, $l = 1, 2, \ldots, r$, are the $r$ zeros of $Q(z)$ with $|z_l| > 1$. Since, in addition,
any p.g.f. must satisfy the condition $G(1) = 1$, we finally obtain

(4.5) $G(z) = \prod_{l=1}^{r} (1 - z_l)(z - z_l)^{-1}$,

(4.6) $E(q_\nu) = dG(z)/dz\,|_{z=1} = \sum_{l=1}^{r} (z_l - 1)^{-1}$.

The study of the $q_\nu$ distribution is thus reduced to a study of the roots $z_l$ of $Q(z)$
with $|z_l| > 1$.

It is not generally possible to obtain explicit expressions for the roots $z_l$,
but they must all lie on a curve of the complex plane defined by the equation

(4.7) $|z|^{g/r} = |1-\alpha+\alpha z|^{(r+g)/r}$.

For any specified direction of $z$ in the complex plane, one can sketch the graphs
of the two sides of (4.7) as a function of $|z|$ and show that for $\alpha(r+g) < g$,
the two graphs always intersect twice, once for $|z| \le 1$ and once for $|z| > 1$.
The curve of (4.7), therefore, consists of two closed paths $C'$ and $C$ such as shown
in Fig. 1, one lying inside the unit circle and the other outside.

The roots $z_l$ must also satisfy the equation

(4.8) $z_l^{g/r} = \gamma_l (1-\alpha+\alpha z_l)^{(r+g)/r}$,

with $\gamma_l^{r} = 1$, and one can show that there is one and only one root of (4.8) on
the curve $C$ of Fig. 1 corresponding to each of the $r$ distinct values of $\gamma_l$ with
$\gamma_l^{r} = 1$. By suitable numbering of the roots $z_l$ we can choose $\gamma_l$ so that

(4.9) $\gamma_l = \exp[2\pi i(l-1)/r]$.

We can also interpret (4.8) as a one-to-one mapping of the $r$ roots on $C$ into
the $r$ values of $\gamma_l$ equally spaced around the unit circle. If we let $r$ and $g \to \infty$
keeping $r/g$ and $\alpha$ fixed, the curve $C$ also stays fixed but the values of $\gamma_l$ become
densely and uniformly distributed on the unit circle. At the same time, the
roots $z_l$ become dense on $C$.

We know already from Section 3 that for the above limiting process $E(q_\nu) \to 0$.
This can also be derived from (4.6) by observing that for $r \to \infty$ the sum in
(4.6) becomes the Riemann sum for an integral which we may interpret as
either an integral with respect to the continuous real variable $l$ or with respect
to the complex variable $\gamma$ around the unit circle. If we choose the last form
we find

(4.10) $r^{-1}E(q_\nu) \to (2\pi i)^{-1} \oint \{\gamma[z(\gamma) - 1]\}^{-1}\,d\gamma$.

Fig. 1

The function $z(\gamma)$ defined by (4.8) for $|\gamma| = 1$ is also defined for $|\gamma| > 1$. For
$|\gamma| \ge 1$, $z(\gamma)$ is analytic, $|z(\gamma)| > 1$, and is of order $\gamma$ for $\gamma \to \infty$. The contour
integral in (4.10) therefore vanishes by virtue of Cauchy's theorem. In addition,
the difference between the Riemann sum of an analytic function and the integral
over any closed path is asymptotically smaller than any finite power of the
spacing between points. $E(q_\nu)$ is, therefore, smaller than any finite power of
$r^{-1}$ for $r \to \infty$.
If we define $z_l$ for non-integer real $l$ through (4.8) and (4.9), it follows that

$\int_{1/2}^{r+1/2} (z_l - 1)^{-1}\,dl = 0$.

By dividing this integral into $r$ parts and subtracting it from (4.6), we can
also write $E(q_\nu)$ as the difference between a Riemann sum and its limiting integral, i.e.

(4.11) $E(q_\nu) = \sum_{l=1}^{r} \Big\{ (z_l - 1)^{-1} - \int_{l-1/2}^{l+1/2} (z_{l'} - 1)^{-1}\,dl' \Big\}$.

5. Nearly Critical Arrival Rate. Equations (4.6) and (4.11) are particularly
well suited to the evaluation of $E(q_\nu)$ when $\alpha \to g/(r+g)$ because in this case
we find that $z_1 \to 1$ and the one term of (4.6) for $l = 1$ becomes infinite. If, however,
we let $r \to \infty$ and $\alpha \to g/(r+g)$ simultaneously then some of the neighboring
roots to $z_1$, for example $z_2$ and $z_r$, also approach 1.

Since $z_l$ is defined by (4.8) and (4.9) also for negative values of $l$ and is periodic
in $l$ with period $r$, we may consider $l$ in the range $-r/2 < l \le r/2$, for example,
so that the roots nearest to $z_1$ are those with small $|l|$, $l = \ldots, -1, 0, 2, \ldots$.
To locate these roots we take the logarithm of both sides of (4.8) and expand
in powers of $(z_l - 1)$ and $\mu$ to obtain

$-4\pi i(l-1)(r+g)r^{-1}g^{-1} - 2(z_l - 1)(r+g)^{1/2}(rg)^{-1/2}\mu + (z_l - 1)^2 + O\{(z_l - 1)^2 \mu r^{-1/2},\ (z_l - 1)^3\} = 0$.

The roots of this approximately quadratic equation with $|z_l| > 1$ are

(5.1) $z_l - 1 = (r+g)^{1/2}(rg)^{-1/2}\{\mu + [\mu^2 + 4\pi i(l-1)]^{1/2}\} + O\{(z_l - 1)^2\}$

and in particular

(5.2) $z_1 - 1 = 2(r+g)^{1/2}(rg)^{-1/2}\mu + O[(r+g)r^{-1}g^{-1}\mu^2]$.
We conclude immediately from this that, if $r$ and $g$ are finite,

(5.3) $E(q_\nu) = (z_1 - 1)^{-1} + O(1)$, for $\mu \to 0$,
$\qquad\; = rg\,\{2(r+g)[g - \alpha(r+g)]\}^{-1} + O(1)$, for $\alpha \to g/(r+g)$,

in which $O(1)$ here means order relative to $\mu$ as $\mu \to 0$ but not relative to $r$ and $g$.

Suppose we now let $r \to \infty$ with $\mu$ fixed, particularly with $\mu < 1$ since this is
the only case that could not be handled satisfactorily in Section 3. Except when
$|l| \ll r$ and $z_l \approx 1$, the difference between $(z_l - 1)^{-1}$ and its integral between
$l - \tfrac12$ and $l + \tfrac12$ is of the order of magnitude of the second derivative of $(z_l - 1)^{-1}$
with respect to $l$, which in turn is of order $r^{-2}$ according to (4.8) and (4.9). The
sum of all such terms in (4.11) is at most of order $r^{-1}$ and so any significant
contribution to (4.11) can come only for the small values of $|l|$ where (5.1)
is applicable. From (4.11) and (5.1), we obtain

(5.4) $E(q_\nu) = \Big(\frac{rg}{r+g}\Big)^{1/2} \sum_{l=-\infty}^{\infty} \Big\{ \big(\mu + [\mu^2 + 4\pi i(l-1)]^{1/2}\big)^{-1} - \int_{l-1/2}^{l+1/2} \big(\mu + [\mu^2 + 4\pi i(l'-1)]^{1/2}\big)^{-1}\,dl' \Big\} + O(1)$

with the dominant error term of $O(1)$ relative to $r$ coming from the error term
of (5.1), particularly for $l = 1$ and to a lesser extent from the other $l$ with $|l| \ll r$.

The terms in the sum (5.4) are $O(l^{-5/2})$ for $|l| > \mu^2$, so the series converges
rapidly enough to be of practical use even for $\mu \sim (4\pi)^{1/2} \sim 3$. For small $\mu$, the
main contribution, however, comes from $l = 1$ where the first term in the bracket
of (5.4) is $(2\mu)^{-1}$ while all other contributions to the series are at most of order
1 even for $\mu \to 0$. Generally we obtain for $\mu$ of order 1 or less

(5.5) $E(q_\nu) = [rg/(r+g)]^{1/2}\{(2\mu)^{-1} + O(1)\}$

and for $\mu < 1$ we can expand (5.4) in powers of $\mu$ to obtain

(5.6) $E(q_\nu) = \Big(\frac{rg}{r+g}\Big)^{1/2} \Big[\frac{1}{2\mu} - A + \frac{\mu}{4} + O(\mu^2)\Big]$

with

(5.7) $A = (2\pi)^{-1/2} \lim_{R\to\infty} \Big(2(R + \tfrac12)^{1/2} - \sum_{l=1}^{R} l^{-1/2}\Big) \approx 0.582$.

One can estimate that $O(\mu^2)$ is roughly $-\mu^2/20$ and in succeeding terms the
important parameter is $\mu/(4\pi)^{1/2}$, so that (5.6) will be correct to within about 30%
even for « = 1. The error in (3.7) for «1 = 1 should be of comparable size and if 
one compares (5.6) with (3.7) one finds that they agree to within a factor of 
about $ forr— © and yz = 1. 
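The constant in (5.7) and the first three terms of (5.6) are easy to evaluate numerically. The sketch below is an illustration only: it assumes (5.7) reads $A = (2\pi)^{-1/2}\lim_{R\to\infty}[2(R+\tfrac12)^{1/2} - \sum_{l=1}^{R} l^{-1/2}]$ and that (5.6) has the form $[rg/(r+g)]^{1/2}[1/(2\mu) - A + \mu/4]$ with the remainder dropped; the function names and the cutoff `R` are hypothetical.

```python
from math import sqrt, pi

def constant_A(R=200_000):
    """Finite-R approximation to the limit in (5.7)."""
    partial = sum(l ** -0.5 for l in range(1, R + 1))
    return (2.0 * sqrt(R + 0.5) - partial) / sqrt(2.0 * pi)

def mean_queue_near_critical(r, g, alpha, A):
    """Leading terms of (5.6); the remainder term is neglected."""
    scale = sqrt(r * g / (r + g))
    mu = (g - alpha * (r + g)) / scale
    return scale * (1.0 / (2.0 * mu) - A + mu / 4.0)

A = constant_A()
print("A is approximately", round(A, 4))
# r = g = 10 at the arrival probability where mu = 1
alpha = (10 - sqrt(5.0)) / 20
print("E(q) estimate:", mean_queue_near_critical(10, 10, alpha, A))
```

The limit converges quickly because the half-integer offset in $2(R+\tfrac12)^{1/2}$ cancels the leading correction of the partial sum.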

Since for $r \to \infty$, the effect of the queue on $w$ will not be significant unless
$E(q_\nu)$ is of order $r$, this will not occur unless $\mu$ is $O(r^{-1/2})$ and $g - \alpha(r+g) = O(1)$.
If, in fact, $g - \alpha(r+g) = 1$ the queue causes $w$ to increase by a factor of 2.

We note finally that for certain values of $(r+g)/r$, namely 2, 3/2, 3, 4/3 and 4,
one can obtain exact expressions for the roots $z_l$ by virtue of the fact that (4.8)
gives a set of quadratic, cubic or quartic equations. One can, therefore, also
obtain exact explicit formulas for $E(q_\nu)$ and $w$. If, for example, $r = g$, then

(5.8) $z_l - 1 = \tfrac12 \gamma_l^{-1} \alpha^{-2} \{1 - 2\alpha\gamma_l + [1 - 4\alpha(1-\alpha)\gamma_l]^{1/2}\}$,

in which the square root must be chosen in the right half of the complex plane
to give $|z_l| > 1$. In particular

(5.9) $z_1 - 1 = \alpha^{-2}(1 - 2\alpha)$.
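For the case $r = g$ the exact evaluation can be sketched as follows. This is an illustrative calculation, not the author's own computation: it assumes that the zeros of $Q(z) = z^{g} - (1-\alpha+\alpha z)^{2g}$ are obtained by solving $z = \gamma(1-\alpha+\alpha z)^2$ for each $g$th root of unity $\gamma$, keeps from each quadratic the root of larger modulus, and sums $(z_l - 1)^{-1}$ as in (4.6). The function name is hypothetical.

```python
import cmath

def mean_queue_equal_phases(g, alpha):
    """E(q) for r = g from the zeros of Q(z) lying outside the unit circle.

    Requires alpha < 1/2, the equilibrium condition (3.1) when r = g.
    """
    total = 0.0
    for l in range(g):
        gamma = cmath.exp(2j * cmath.pi * l / g)
        # z = gamma * (1 - alpha + alpha*z)**2 written as c2*z**2 + c1*z + c0 = 0
        c2 = gamma * alpha ** 2
        c1 = 2 * gamma * alpha * (1 - alpha) - 1
        c0 = gamma * (1 - alpha) ** 2
        disc = cmath.sqrt(c1 * c1 - 4 * c2 * c0)
        z_a = (-c1 + disc) / (2 * c2)
        z_b = (-c1 - disc) / (2 * c2)
        # The product of the two roots has modulus ((1-alpha)/alpha)**2 > 1,
        # so the root of larger modulus is the one outside the unit circle.
        z = z_a if abs(z_a) > abs(z_b) else z_b
        total += 1 / (z - 1)
    return total.real   # imaginary parts cancel between conjugate roots

print(mean_queue_equal_phases(1, 0.3))
print(mean_queue_equal_phases(10, 0.4))
```

For $g = 1$ the sum has the single term $\alpha^{2}/(1-2\alpha)$, the reciprocal of the right-hand side of (5.9), which gives a quick check of the routine.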


6. Comparison with Webster’s Formula. The only formula with which we can 
compare the above results is Webster’s semi-empirical formula [8] for delays 
which is based upon the assumption that the arrivals form a Poisson distribution 
rather than a binomial distribution as assumed here. Webster’s formula consists 
of three terms; the first is essentially the same as (2.3) with E(q,) = 0 and repre- 
sents the delay for regularly spaced arrivals; the second term is the delay that 
results from a queue when arrivals have a Poisson distribution but the service 
time is a constant equal to (r + g)/g time intervals; and the third term is an 
empirical correction obtained by fitting curves to values calculated by computer 
simulation. 

Since a Poisson distribution allows arbitrarily small time intervals between
arrivals, fluctuations may cause more cars to arrive in some green period than
can leave. Because of this one finds for a Poisson distribution of arrivals that
even when $r \to 0$ (no traffic light) one still has a queue and furthermore the
average length of the queue becomes infinite as the arrival rate approaches the
critical value. As pointed out by BMW, the binomial distribution has the advantage
of forcing a minimum spacing between cars and so we avoid this unfortunate
limiting behavior, even though this is accomplished in a somewhat
artificial way wherein the spacings are confined to be integer multiples of the
minimum spacing.

By using methods very similar to those described in Sections 1 to 5, it is possible
also to compute the queue lengths and delays when the arrivals have a
Poisson distribution, provided we assume that (2.1) still holds. We need only
replace the p.g.f. for the binomial distribution of $u_\nu$ by the corresponding expression
for the Poisson distribution. By doing this one finds as the analogue
of (5.5)

(6.1) $E(q_\nu) = \tfrac12 g\,[g - \alpha(r+g)]^{-1} + O(r^{1/2})$,

the leading term of which is $(g+r)/r$ times as large as in (5.5). The average
waiting time for nearly critical arrival rate is then given by

(6.2) $w = r^2/\{2(1-\alpha)(g+r)\} + (g+r)/\{2[g - \alpha(r+g)]\} + O(r^{1/2})$.

Furthermore, in (6.1) and (6.2), the $O(r^{1/2})$ are asymptotically proportional
to $r^{1/2}$.

The first term of (6.2) is essentially the same as the first term of Webster's
formula and the coefficient of $[g - \alpha(r+g)]^{-1}$ in the second term has the same
value as in Webster's formula for $\alpha \to g/(r+g)$. The third term of Webster's
formula, however, is not asymptotically proportional to $r^{1/2}$, nor does it indicate
in any obvious way the importance of the magnitude of $g - \alpha(r+g)$ as compared
with $[rg/(r+g)]^{1/2}$.
PROBABILITY CONTENT OF REGIONS UNDER SPHERICAL NORMAL
DISTRIBUTIONS, I¹


By Harold Ruben
Columbia University 


1. Introduction. The primary purpose of this series of papers is to attempt to 
lay the groundwork for a relatively well-rounded theory of the spherical normal 
distribution. Many distributional problems in mathematical statistics may be 
regarded as particular instances of one general problem, the determination of 
the probability content of geometrically well-defined regions in Euclidean N- 
space when the underlying distribution is centered spherical normal and has 
unit variance in any direction. Specifically then, we require for a definite region R 


(1.1) $P(R) = (2\pi)^{-N/2} \int_{x \in R} e^{-x'x/2}\,dx$,

in which $x' = (x_1, \ldots, x_N)$. The class of problems represented by (1.1) is a
very broad one and the literature on it is correspondingly quite enormous and 
well-diffused. In fact, all the distributional problems which occur in the theory 
of sampling from multivariate normal populations may in principle be brought 
under our general heading. Thus, let $y_i$, $i = 1, 2, \ldots, n$, denote $n$ mutually
independent $k$-dimensional vectors each of which is governed by the elementary
probability density

(1.2) $p(y) = (2\pi)^{-k/2} |V|^{-1/2} \exp(-\tfrac12 y'V^{-1}y)$.

The joint probability density function for the $n$ vectors is $\prod_1^n p(y_i)$ and integrals
of the form

(1.3) $(2\pi)^{-nk/2} |V|^{-n/2} \int \exp\big(-\tfrac12 \sum_1^n y_i'V^{-1}y_i\big) \prod_1^n dy_i = (2\pi)^{-N/2} |W|^{-1/2} \int_{z \in T} \exp(-\tfrac12 z'W^{-1}z)\,dz$,

where $z$ is a partitioned vector, $W$ is a partitioned matrix,
$N = nk$ and $T$ a specified region in Euclidean $N$-space, may be thrown in the
form (1.1) by a linear orthogonal transformation chosen so as to orient the new
axes along the axes of the ellipsoids of constant density of the distribution of $z$,
followed by a simple scaling transformation to convert the ellipsoids into spheres.

We recall a second and frequently more convenient method of reducing $z'W^{-1}z$
to a sum of squares. By means of triangular resolution, $W$ may be factored [1]
in the form


(1.5) W = MM’, 
where the N X N matrix M is defined by 


(1.6) 


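and $V = LL'$, $L$ denoting a $k \times k$ lower triangular matrix. On setting

(1.7) $z = Mx$,

we have

(1.8) $(2\pi)^{-N/2} |W|^{-1/2} \int_{z \in T} \exp(-\tfrac12 z'W^{-1}z)\,dz = (2\pi)^{-N/2} \int_{x \in R} \exp(-\tfrac12 x'x)\,dx$,

where $R = M^{-1}(T)$.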

In view of the preceding discussion no loss of generality results in assuming, 
whenever necessary, that the distribution is given by (1.1). 

We shall list, briefly review and discuss a number of important distributional
problems, together with some applications, which are formally reducible to
integrals of the form (1.1). In the first few illustrations, the regions $R$ constitute
relatively simple geometrical entities, such as half-spaces, hyperspheres, hypercones
and hypercylinders, for which the statistical applications are both classic
and familiar, but in later illustrations more complex bodies, such as ellipsoids,
simplices and polyhedral cones, are considered. In particular, the last named
case of polyhedral cones, corresponding to the difficult and important problem
of the multivariate normal integral, and more especially the bivariate normal
integral (when the dimensionality of the polyhedral cone is 2), will be investigated
in some detail.

Integrals of the form (1.1) are rarely capable of being expressed in closed form
using well-known functions. Nevertheless, it is hoped that the current presentation
will provide a unifying thread and thereby help to stimulate further research.
In the sequel a quite powerful method, referred to as the "method of
sections," will frequently be used to deal with the integrals. This consists in
dividing up the region $R$ by means of a series of parallel and adjoining $(N-1)$-flats
and in the exploitation of the following fundamental property of the spherical
normal distribution of dimensionality $N$: The conditional probability distribution
in any linear subspace of dimensionality $N - k$ $(k = 1, 2, \ldots, N-1)$






is itself spherical normal with dimensionality $N - k$ and with variance in any
direction equal to the variance of the original $N$-dimensional distribution. Let
$O$ be the center of distribution, $P$ any point in $R$ and $M$ the foot of the perpendicular
from $O$ to the flat through $P$. Further, let $OP = r$, $OM = \xi$, $PM = \eta$, with
$r^2 = \xi^2 + \eta^2$. Then the p.d.f. at $P$ is

$(2\pi)^{-N/2} e^{-r^2/2} = (2\pi)^{-1/2} e^{-\xi^2/2} \times (2\pi)^{-(N-1)/2} e^{-\eta^2/2}$,

and the distribution in the flat through $P$ is spherical normal with dimensionality
$N - 1$. It follows that the probability content of the infinitesimal region intercepted
by $R$ between two parallel flats distant $\xi$ and $\xi + d\xi$ from $O$ is of the form

(1.9) $(2\pi)^{-1/2} e^{-\xi^2/2}\,d\xi\,Q(\xi; R)$,

where $Q(\xi; R)$ is itself obtained by evaluating an integral of the form (1.1),
with $N$ replaced by $N - 1$. Consequently,

(1.10) $P(R) = \int_{\xi_1}^{\xi_2} (2\pi)^{-1/2} e^{-\xi^2/2}\,Q(\xi; R)\,d\xi$,

where $\xi_1$ and $\xi_2$ are the distances of the bounding flats to $R$ from $O$. If, further,
the section of each cutting flat is a region of the same geometrical type as
$R$ (e.g. $R$ an ellipsoid and the section an ellipsoid), with the center $M$ of the
$(N-1)$-dimensional spherical distribution in the flat bearing the same geometrical
relationship with respect to the $(N-1)$-dimensional figure as does $O$
with respect to $R$ (e.g. both $O$ and $M$ are centers of ellipsoids), then (1.10)
becomes an integral recurrence relationship (see Sections 7 and 8).


2. Probability content of a half-space. The probability content of the infinite
parallel slab $R$ defined by $p_1 \le \sum_1^N a_i x_i \le p_2$ is given directly by the method
of sections as

(2.1) $(2\pi)^{-1/2} \int_{p_1/(\sum a_i^2)^{1/2}}^{p_2/(\sum a_i^2)^{1/2}} e^{-\xi^2/2}\,d\xi$.

Here the flats dividing $R$ are taken parallel to the bounding flats and $Q(\xi; R) = 1$,
$\xi_1 = p_1/(\sum_1^N a_i^2)^{1/2}$, $\xi_2 = p_2/(\sum_1^N a_i^2)^{1/2}$. In particular, for the lower half-space
$\xi_1 = -\infty$ and (2.1) becomes

(2.2) $(2\pi)^{-1/2} \int_{-\infty}^{p_2/(\sum a_i^2)^{1/2}} e^{-\xi^2/2}\,d\xi$.

These results are, of course, a reflection of the fact that $\sum a_i x_i$ is distributed
normally with zero mean and variance $\sum a_i^2$.
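As a small numerical illustration of (2.1) and (2.2), the slab and half-space contents reduce to differences of the standard normal distribution function evaluated at the scaled bounds. The sketch below uses only the error function from the Python standard library; the function names are illustrative and not taken from the paper.

```python
from math import erf, sqrt

def std_normal_cdf(t):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + erf(t / sqrt(2.0)))

def slab_probability(a, p1, p2):
    """Content of p1 <= sum(a_i x_i) <= p2 under the unit spherical normal, as in (2.1)."""
    norm = sqrt(sum(ai * ai for ai in a))
    return std_normal_cdf(p2 / norm) - std_normal_cdf(p1 / norm)

def lower_half_space_probability(a, p2):
    """Content of sum(a_i x_i) <= p2, as in (2.2)."""
    norm = sqrt(sum(ai * ai for ai in a))
    return std_normal_cdf(p2 / norm)

# The bounding flat 3*x1 + 4*x2 = 5 lies at unit distance from the center.
print(lower_half_space_probability([3.0, 4.0], 5.0))
print(slab_probability([1.0, 0.0, 0.0], -1.0, 1.0))
```

Only the distances of the bounding flats from the center enter, so the answer does not depend on the dimensionality or on the orientation of the slab.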


3. Probability contents of centrally and non-centrally located hyperspheres.
Historically, the central $\chi^2$ distribution was one of the first directly entailing
probability contents of regions in $N$-space when the density is spherical normal.²
For the central $\chi^2$ distribution the region in question is a sphere whose center
coincides with the center of the distribution, while for the non-central distribution
the region is a sphere whose center is non-coincident with the center of the
distribution.

² A geometrical derivation of the $\chi^2$ distribution for 3 degrees of freedom is implicit in
Maxwell's great work [2] concerning the energy distribution of gas molecules. Each of three
orthogonal components of velocity has an identical and independent normal distribution
with zero mean, and the energy, suitably standardized, is a $\chi^2$ with 3 degrees of freedom.

Let $\chi^2_N$ and $\chi^2_{N,\kappa}$ refer generically to a variate distributed as $\chi^2$ with $N$ degrees
of freedom and a non-central $\chi^2$ variate with $N$ degrees of freedom and non-centrality
parameter $\kappa$, respectively ($\chi^2_{N,0} \equiv \chi^2_N$). The latter variate is defined by
$\chi^2_{N,\kappa} = \sum_1^N (x_i - \kappa_i)^2$, where the $x_i$ are independent normal variables with zero
means and unit variances, and $\kappa^2 = \sum_1^N \kappa_i^2$. Further, denote the distribution
functions of these two variates by $F_N(a^2)$ and $G_{N,\kappa}(a^2)$. Correspondingly, lower
case letters shall denote the p.d.f.'s. Then


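(3.1) $F_N(a^2) = P(\chi^2_N \le a^2) = \int\cdots\int_{R:\ \sum_1^N x_i^2 \le a^2} (2\pi)^{-N/2} \exp\big(-\tfrac12 \sum_1^N x_i^2\big)\,dx_1 \cdots dx_N$

$\qquad = \int\!\!\int (2\pi)^{-N/2} \exp(-\tfrac12 r^2)\,r^{N-1}\,dr\,d\omega$

$\qquad = S_N(1) \int_0^{a} (2\pi)^{-N/2} \exp(-\tfrac12 r^2)\,r^{N-1}\,dr$

$\qquad = [2^{N/2}\Gamma(N/2)]^{-1} \int_0^{a^2} \exp(-\tfrac12 r^2)\,(r^2)^{N/2-1}\,d(r^2)$,

where $d\omega$ is the solid angle subtended at the center of the distribution by an
infinitesimal volume element and $S_N(c)$ is the surface-content of a hypersphere
of $N$ dimensions with radius $c$,

(3.2) $S_N(c) = 2\pi^{N/2} c^{N-1}/\Gamma(N/2)$.

This gives the usual Incomplete Gamma Function for the distribution function.
On differentiating with respect to $a^2$,

(3.3) $f_N(a^2) = [2^{N/2}\Gamma(N/2)]^{-1} \exp(-\tfrac12 a^2)\,(a^2)^{N/2-1}$.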


Pedagogically, perhaps a more useful geometrical derivation is to "slice" up
the sphere into infinitesimally thin slices by a set of parallel planes. This corresponds
to a proof by induction (cf. [3], pp. 247-8). Let $x$ denote the distance of
a typical slice from the center of the sphere. Then

$F_N(a^2) = \int\cdots\int_{R:\ \sum_1^N x_i^2 \le a^2} (2\pi)^{-N/2} \exp\big(-\tfrac12 \sum_1^N x_i^2\big)\,dx_1 \cdots dx_N$

$\qquad = \int_{-a}^{a} (2\pi)^{-1/2} \exp(-\tfrac12 x^2)\,F_{N-1}(a^2 - x^2)\,dx$,

on noting that the density at a point on the "$x$-slice", which intersects the given
sphere in a sphere of dimensionality $N - 1$ and radius $(a^2 - x^2)^{1/2}$, distant $y$
from the center of the latter sphere, is $(2\pi)^{-N/2} \exp[-\tfrac12(x^2 + y^2)]$. On differentiating
with respect to $a^2$,

$f_N(a^2) = \int_{-a}^{a} (2\pi)^{-1/2} \exp(-\tfrac12 x^2)\,f_{N-1}(a^2 - x^2)\,dx$

$\qquad = \int_{-a}^{a} (2\pi)^{-1/2} \exp(-\tfrac12 x^2)\,[2^{(N-1)/2}\Gamma(\tfrac{N-1}{2})]^{-1} \exp\{-\tfrac12(a^2 - x^2)\}\,(a^2 - x^2)^{(N-3)/2}\,dx$

$\qquad = \exp(-\tfrac12 a^2)\,[(2\pi)^{1/2}\,2^{(N-1)/2}\,\Gamma(\tfrac{N-1}{2})]^{-1}\,(a^2)^{N/2-1}\,B(\tfrac12, \tfrac{N-1}{2})$

$\qquad = [2^{N/2}\Gamma(N/2)]^{-1} \exp(-\tfrac12 a^2)\,(a^2)^{N/2-1}$.

For the non-central x’ distribution, we require the distribution of >°! (2; — «;)’, 
where the z; are mutually independent standardized normal random variables. 
Let O be the center of the distribution which is taken as before to be the origin 
of coordinates and K the point (x, x2, ++: , xv). Let P be any point with co- 
ordinates (x; , 22, -** ,2w), letOK =«,« = (ei taat--+ + xy)’, KP = &, and 
let the angle between KP and the line OK, produced in the sense O to K, be @. 
Then 


N 


> zi = OP = 2° + & + Anz cos 8, 
1 
and 


Gy,(a’) 


- dty 


= f [ (Qe) exp[—4(«° + e + 2xé cos 6)\e"" dt dw 


= (20) exp(—4’) [ [ exp(—}é" — xé cos 0)E* dé du. 


‘ exp(—xé cos 0) dw = 2x (4ut)~ 9 Taya (Ke), 
0 


where J,(z) = « “J,(iz) is the Bessel function of the first kind with purely 
imaginary argument. This follows directly by dividing up the surface of the 
hypersphere into annuli d@, the content of such an annulus being Sy_,(sin 6) dé. 
Thus,
[ exp(—«t cos 0) dw = [ exp(—«ét cos 6)Sy_,(sin @) dé 


= ie r C =) [ exp(—«é cos 6) sin” * 6 dé, 


and this integral is related to the Bessel function 


1 
Ia(2) = [Vat (n + G2)" | exp(ter)(1 - vt)" dv (Rw + §) > 0), 


by setting v = cos 6. Alternatively, exp(—«é cos 6) may be expanded as a power 
series in cos @ after which integration is effected term by term. Hence, finally, 


Gy(a’) = (29) exp(—}x’) 
(3.5) A “1 ow —(n-1) 
[ exp( — }¢°)g"*- 2a” (hue) Tyw—a( ne) dé 


and 
gv;(a) = $x exp( —4e’)a” exp( — 4a”) Iywa(«a) 
(3.6) = 2 exp( —4x*) (a’)” “exp( — 4a") > (1/T(an + r)) 


((4ua)”/ri}. 


The above geometrical derivation seems to have been used first, in essence, by 
Patnaik [4). 
An alternative and simpler geometrical method consists once again in dividing 


the sphere R by a set of parallel hyperplanes. Take these to be perpendicular to 
the line OK and let z be the distance of a typical plane from K. Then 


Gy,(a’) = [- . -f (2e) exp (- Ly “') da, +++ dty 


BE (2,—44)* S08 


(3.7) = [ (2) exp[—4(« + 2)"Fvala’ — 2) dz. 


Hence, 


Qv.(a) = exp(—}x) [ exp(—«r)- (2) exp(—42")fyala’ — 2*) dr 


exp(—}«) exp(—4a’) / [ (ae ato (7 "| 


[ exp(—«r)(a’ — 2*)** dz 


=}, %” exp(—}<’)a”™ exp(—4a’) Iyw-1(xa), 
after substitution for fy.,(a’ — 2’) from (3.3) and using the above integral
formula for J,(z). 

Some exact values for the probability integral of the non-central x’ are available 
in [5] and [6]. A more extensive set of values is provided in Fix’s tables [7] de- 
signed to yield the power function of x’. For studies in connexion with suitable 
approximations to the non-central x’ distribution, reference is made to [4], [8] 
and [9]. Finally, various tables of the non-central x’ distribution for the special 
case of two degrees of freedom have become available in recent years for applica- 
tion in ballistic problems ({10}, [11], [12], [13]}). 


4. Probability content of a symmetrically and asymmetrically located hyperspherical
cone. Consider a hyperspherical half-cone $R$ with vertex at the center
of the spherical normal distribution. Let the angle between the axis of the cone
and a generator be $\theta$. The probability content $P(R)$ of the cone is given by its
relative solid angle, i.e., the ratio of the surface-content of the region, a cap, on
a unit sphere with center at the vertex of the cone which is demarcated by the
cone to the surface-content of the entire sphere. Hence, by division of the cap
into a set of annuli with radii $\sin\theta'$,

(4.1) $P(R) = \int_0^{\theta} S_{N-1}(\sin\theta')\,d\theta' / S_N(1)$

$\qquad = \big[\Gamma(\tfrac{N}{2}) \big/ \big(\pi^{1/2}\,\Gamma(\tfrac{N-1}{2})\big)\big] \int_0^{\theta} \sin^{N-2}\theta'\,d\theta'$

$\qquad = \tfrac12 I_{\sin^2\theta}\big(\tfrac{N-1}{2}, \tfrac12\big)$,

where $I$ denotes the Incomplete Beta Function Ratio.
Define

(4.2) $t_{N-1} = \xi / (\chi^2_{N-1}/(N-1))^{1/2}$,

where $\xi$ is a normal variate with zero mean and unit variance distributed independently
of $\chi^2_{N-1}$, a $\chi^2$ variate with $N - 1$ degrees of freedom. The variate
$t_{N-1}$ may be expressed in the form

(4.3) $t_{N-1} = x_1 \Big/ \Big(\sum_2^N x_i^2/(N-1)\Big)^{1/2}$,

where the $x_i$ $(i = 1, 2, \ldots, N)$ are independent normal variates, each with
zero mean and unit variance. The region $t_{N-1} \ge c$ $(c \ge 0)$ defines a half-cone with
vertex at the center of the distribution of the $x_i$ and with axis oriented along the
$x_1$-axis. The angle between the axis and a generator is $\theta = \text{arc cot}\,[c/(N-1)^{1/2}]$.
The distribution function of $t_{N-1}$ is given by (4.1), where here

(4.4) $\sin^2\theta = 1/\{1 + c^2/(N-1)\}$.

The density function $-dP/dc$ is

(4.5) $g_{N-1}(c) = \big[(N-1)^{1/2}\,B\big(\tfrac{N-1}{2}, \tfrac12\big)\big]^{-1} \Big(1 + \frac{c^2}{N-1}\Big)^{-N/2}$.

The simplest application of the $t$-distribution relates to the "Studentized"
mean of a normal sample. The geometrical derivation of the latter quantity is
well-known (see e.g. [3], pp. 239-40). The relevant hyperspherical cone in this
instance has its axis along the line $x_1 = x_2 = \cdots = x_N$ equally inclined to the
coordinate planes. The above argument implies, however, that the probability
content of a hyperspherical cone of given angle and with vertex at the center of
a spherical normal distribution is independent of its orientation.
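The relative solid angle of the half-cone can be evaluated directly as a ratio of two integrals of $\sin^{N-2}\theta'$, which avoids any special-function library. The following is a sketch under that assumption; the function name and the Simpson step count are illustrative choices, not taken from the paper.

```python
from math import sin, cos, pi, atan

def cone_probability(N, theta, n=2000):
    """Content of a half-cone of semi-angle theta (vertex at the center) in N-space, N >= 2."""
    def simpson(f, lo, hi):
        h = (hi - lo) / n
        s = f(lo) + f(hi)
        for i in range(1, n):
            s += (4 if i % 2 else 2) * f(lo + i * h)
        return s * h / 3.0

    def weight(t):
        return sin(t) ** (N - 2)     # content of an annulus, up to a constant factor

    # cap content divided by the content of the whole unit sphere
    return simpson(weight, 0.0, theta) / simpson(weight, 0.0, pi)

# Three dimensions: the cap fraction is (1 - cos(theta)) / 2.
print(cone_probability(3, pi / 3), (1.0 - cos(pi / 3)) / 2.0)
# Two dimensions with c = 1: theta = arc cot(1) = pi/4, the upper tail of t with 1 d.f.
print(cone_probability(2, pi / 4), 0.5 - atan(1.0) / pi)
```

The second comparison uses the fact that $t$ with one degree of freedom is a Cauchy variate, whose upper tail at $c$ is $\tfrac12 - \arctan(c)/\pi$.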

Consider next a hyperspherical cone whose vertex does not coincide with the 
center of a given spherical normal distribution but whose axis passes through 
the latter point. As before, let the angle between the axis and a generator be @. 
The probability content P(R) of the cone may be obtained by considering 
sections perpendicular to the generator. Each such section is a hypersphere and 
the surfaces of equal density in the flat forming the section are hyperspheres. 
Hence’, 


(4.6) $P(R) = \int_{\lambda}^{\infty} (2\pi)^{-1/2} \exp(-\tfrac12 x^2)\,F_{N-1}((x - \lambda)^2 \tan^2\theta)\,dx$,

where $\lambda$ is the distance of the vertex from the center of the distribution, and
$F_{N-1}(\cdot)$ is defined in (3.1).

Define

(4.7) $t_{N-1,\lambda} = (\xi - \lambda) / \{\chi^2_{N-1}/(N-1)\}^{1/2}$,

where $\xi$ is a normal variate with zero mean and unit variance distributed independently
of $\chi^2_{N-1}$, a $\chi^2$ variate with $N - 1$ degrees of freedom ($t_{N-1} \equiv t_{N-1,0}$).
The variate $t_{N-1,\lambda}$ may be expressed in the form


(48) tras = bi — 1 / [= 21/(N — 1)] 


where the z,(i = 1, 2,--- , N) are independent normal variates each with zero 
mean and unit variance. The region ty.s,, 2 c(c 2 0) defines a half-cone with 
vertex distant \ from the center of the distribution of the z,. The axis of the 
cone is oriented along the z-axis and the angle between the latter and a generator 
is arc cot [e/(N — 1)*]. The distribution function of ty_;., is given by (4.6) with 
@ = are cot [e/(N — 1)']. Setting y = z — d and differentiating with respect to 
y, the density —@P/dc of ty_s,, at c is obtained immediately as 


Grid (c) = Qnr-i (c) exp (— pr (*)| 


4 
’ 


(49) 


. [8 exp (= 2 - AV 00s 6) ae, 


where 1’ sec’ 6/2 has been replaced by z. This density function may be expressed 
in terms of the Hh function [14], defined by 


Hh,(y) = [te"/mi exp [-402 + y)"} dz, 


* This argument incidentally provides an alternative basis for the determination of the 
distribution and density functions of ty_, ,i.e., when A = 0. 
as follows:


N—1:) = wai ( T(N an( ZY] 
(4.10) gv-aale) = gv-a(c) T( | 5 


-exp (— 4)’ sec” @)Hhy_(d cos 6), 
6 being given by (4.4). Alternatively, term by term integration yields 
exp (— })’) au: > (— rAv/2)’ 


gv-1a(c) a ———_______— 


r(X tY cw Wepre ool 


2 
N+r e rf ec ; Me 
. FS) Serwy 


(4.11) 


The “‘tail-end area,” obtained after term by term integration in (4.11), yields 


_ xp (— ') & , 
P(twin 2c) = az 2D Tar + 1) 


id (* = r+!) (— »v2)" 
rr. 2s? Ss r! 


(4.12) 
(ec 2 0) 


(ef. [15]). Tables of the non-central ¢-distribution have been provided by Neyman 
and Tokarska [16], Johnson and Welch [17] and, more recently, by Resnikoff 
and Lieberman [18] together with applications. The simplest application relates 
to the power of the test based on the Studentized mean-statistic from a normal 
sample. The axis of the relevant cone is then along the line z, = z, = --- = zy. 
The above argument implies, however, that the probability content of a hyper- 
spherical cone of given angle and with axis passing through the center of a 
spherical normal distribution is independent of its orientation provided that the 
distance between the vertex and the center of the distribution remains fixed. 


5. Probability content of a region bounded by a variety of revolution of dimensionality
$N - 1$ and of species $p$. Denote a hyperspherical surface (manifold) of
dimensionality $m$ by $S_m$. Then a variety of revolution $S_{N-1}$ of dimensionality
$N - 1$ and of species $p$ is defined by the rotation of a $S_{N-p-1}$, imbedded in a
$(N - p)$-flat $A_{N-p}$ (linear space of dimensionality $N - p$), round a $(N - p - 1)$-flat
$A_{N-p-1}$ in $A_{N-p}$ as axis (see e.g. Sommerville [19], pp. 137-8). The axial plane
of revolution $A_{N-p-1}$ may be regarded as defined by $N - p$ fixed points in a
$(N - 1)$-flat $A_{N-1}$ imbedded in $N$-space ($A_{N-p-1}$ is a linear subspace of $A_{N-1}$)
which has therefore $p$ degrees of freedom and can rotate about $A_{N-p-1}$ in such a
way that each point of $S_{N-p-1}$ generates the surface of a hypersphere with
dimensionality $p + 1$, the latter surface itself being of dimensionality $p$. The
center of the hypersphere is determined by the foot of the perpendicular to
$A_{N-p-1}$ from the given point.

If the equation of the generating surface $S_{N-p-1}$, referred to $N - p$ rectangular
axes in $A_{N-p}$ of which $N - p - 1$, designated the $x_1$-axis, $\dots$, $x_{N-p-1}$-axis, are in
$A_{N-p-1}$, is given by

(5.1)  $x_{N-p}^2 = \phi(x_1, x_2, \dots, x_{N-p-1}),$

then the equation of the generated surface $S_{N-1}$ is

(5.2)  $x_{N-p}^2 + x_{N-p+1}^2 + \cdots + x_N^2 = \phi(x_1, x_2, \dots, x_{N-p-1}),$

since the expression on the left of equ. (5.2) represents the squared perpendicular
distance of a point in $S_{N-p-1}$, when the latter is in a rotated position, from the
axis $A_{N-p-1}$.


We shall now determine the probability content of the region $R$ in $N$-space
obtained by replacing the equality sign in equ. (5.2) by the sign $\le$, under the
assumption that the distribution of the $x_i$ ($i = 1, 2, \dots, N$) is governed by (1.1).

Let $O$ be the center of the distribution, $P$ the point $(x_1, x_2, \dots, x_N)$ on
$S_{N-p-1}$ and $P'$ the point $(x_1, \dots, x_{N-p-1}, 0, \dots, 0)$, $P'$ being the foot of the
perpendicular to $A_{N-p-1}$ from $P$. The locus of $P$ on rotation of $S_{N-p-1}$ is a
hypersphere with radius $\phi^{1/2}(x_1, x_2, \dots, x_{N-p-1})$. Consider the infinitesimal element of
$R$ which projects into the element $dA_{N-p-1}$ located in $A_{N-p-1}$ around $P'$. Since the
density at any point is


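$(2\pi)^{-\frac{1}{2}N} \exp\left(-\tfrac{1}{2} \sum_1^N x_i^2\right) = (2\pi)^{-\frac{1}{2}(N-p-1)} \exp\left(-\tfrac{1}{2} \sum_1^{N-p-1} x_i^2\right) \cdot (2\pi)^{-\frac{1}{2}(p+1)} \exp\left(-\tfrac{1}{2} \sum_{N-p}^{N} x_i^2\right),$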


the distribution in the $(p + 1)$-flat containing the locus of $P$ is spherical normal with
center $P'$. Hence, by (3.1), the probability content of the element is

$(2\pi)^{-\frac{1}{2}(N-p-1)} \exp\left(-\tfrac{1}{2} \sum_1^{N-p-1} x_i^2\right) F_{p+1}(\phi(x_1, x_2, \dots, x_{N-p-1}))\, dA_{N-p-1},$

on recalling that $PP'^2 = \phi(x_1, x_2, \dots, x_{N-p-1})$, where $F_{p+1}(\cdot)$ denotes the
distribution function of a chi-square with $p + 1$ degrees of freedom (equ. (3.1)).
The required probability content is then

(5.3)  $\int \cdots \int (2\pi)^{-\frac{1}{2}(N-p-1)} \exp\left(-\tfrac{1}{2} \sum_1^{N-p-1} x_i^2\right) F_{p+1}(\phi(x_1, x_2, \dots, x_{N-p-1}))\, dA_{N-p-1}.$
Consider in particular the case

(5.4)  $\phi(x_1, \dots, x_{N-p-1}) = [(p + 1)/k(N - p - 1)](x_1^2 + \cdots + x_{N-p-1}^2)$

when the region $R$ becomes

(5.5)  $\{(x_1^2 + \cdots + x_{N-p-1}^2)/(N - p - 1)\} / \{(x_{N-p}^2 + \cdots + x_N^2)/(p + 1)\} \ge k.$

Equation (5.1) now represents a hyperspherical conical surface in $A_{N-p}$, while
equ. (5.2) represents the surface obtained by rotation round $A_{N-p-1}$. The generated
surface is characterized by the property that the radius vector $OP$ is inclined
at a constant angle arc cot $(k(N - p - 1)/(p + 1))^{1/2}$ to the linear space
$A_{N-p-1}$. Furthermore, in (5.3) $dA_{N-p-1}$ may conveniently be chosen as the annulus
between two concentric hyperspherical surfaces in the $A_{N-p-1}$-subspace of
radii $\xi$ and $\xi + d\xi$ ($\xi^2 = \sum_1^{N-p-1} x_i^2 = OP'^2$), and the probability content of $R$ is
then


(5.6)  $\int_0^\infty (2\pi)^{-\frac{1}{2}(N-p-1)} \exp(-\tfrac{1}{2}\xi^2)\, \frac{2\pi^{\frac{1}{2}(N-p-1)} \xi^{N-p-2}}{\Gamma\left(\frac{N-p-1}{2}\right)}\, F_{p+1}\left[\frac{(p + 1)\xi^2}{k(N - p - 1)}\right] d\xi$

on using equ. (3.2) to substitute for $dA_{N-p-1}$.
The expression (5.6) represents $1 - P_{N-p-1,p+1}(k)$, where $P_{N-p-1,p+1}(\cdot)$ denotes
the distribution function of an $F$-variate, $F_{N-p-1,p+1}$, with $N - p - 1$ and
$p + 1$ degrees of freedom. The density function of the $F$-variate at the point
$k$ is


—}(N—2) (prt) 
= OPy —p~1, p+! /ak = V : — (p + 1) _ 
Ee 


ciated a egl s p+1__| 
(5.7) pos | § = pit oy 2th leh ae 


fo = 1 hor 


i 


wa~p-! et! 


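$= \frac{(N-p-1)^{\frac{1}{2}(N-p-1)} (p+1)^{\frac{1}{2}(p+1)}}{B\left(\frac{N-p-1}{2}, \frac{p+1}{2}\right)} \cdot \frac{k^{\frac{1}{2}(N-p-1)-1}}{((N - p - 1)k + (p + 1))^{\frac{1}{2}N}}$

after some reduction.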
The case discussed in Section 4 corresponds to p = N — 2. 


6. Probability contents of symmetrically and asymmetrically located hyper- 
spherical cylinders. A hyperspherical cylinder in N-space is one such that the 
intersection with the cylinder of a (N — 1)-flat perpendicular to the axis of the 
cylinder is a hypersphere. 

There are two distinct cases to consider: 

(a) The axis of the cylinder passes through the center of the distribution. 

(b) The axis of the cylinder does not pass through the center of the distribution. 

The probability content may in both cases be readily evaluated by taking
sections perpendicular to the axis. Let $a$ be the radius of the cylinder in both (a)
and (b), and let $\kappa$ be the distance between the axis of the cylinder and the center
of the distribution in (b). The probability contents of elements formed by
adjoining parallel $(N - 1)$-flats distant $x$ and $x + dx$ from the center of the
distribution perpendicular to the axis of the cylinder are seen directly to be
$(2\pi)^{-1/2} \exp(-\tfrac{1}{2}x^2)\, dx\, F_{N-1}(a^2)$ and $(2\pi)^{-1/2} \exp(-\tfrac{1}{2}x^2)\, dx\, G_{N-1,\kappa}(a^2)$, respectively.
Hence, by integration of $x$ over $(-\infty, \infty)$, the probability content of the cylinder
in case (a) is $F_{N-1}(a^2)$ and that of the cylinder in case (b) is $G_{N-1,\kappa}(a^2)$. A
particularly simple application of (a) relates to the distribution of the sample
variance in normal samples when the cylinder in question has its axis along the
line $x_1 = x_2 = \cdots = x_N$ ([3], p. 238).
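
The two cylinder results can be checked numerically. The following is a minimal sketch in Python, not part of the original paper: the function names are hypothetical, and it assumes that $\kappa$ is the distance of the axis from the center, so that the non-central chi-square has non-centrality $\kappa^2$ and can be written as a Poisson mixture of central chi-squares.

```python
import math

def chi2_cdf(x, df):
    """Central chi-square distribution function F_df(x), computed from the
    power series of the regularized lower incomplete gamma function."""
    if x <= 0.0:
        return 0.0
    s, z = df / 2.0, x / 2.0
    term = math.exp(s * math.log(z) - z - math.lgamma(s + 1.0))
    total = term
    n = 0
    while True:
        n += 1
        term *= z / (s + n)
        total += term
        if n > z and term < 1e-16 * total:
            break
    return min(total, 1.0)

def noncentral_chi2_cdf(x, df, kappa):
    """Non-central chi-square distribution function with non-centrality
    kappa**2 (assumed convention), as a Poisson(kappa**2/2) mixture."""
    mu = kappa * kappa / 2.0
    weight = math.exp(-mu)
    total, covered, j = 0.0, 0.0, 0
    while covered < 1.0 - 1e-15 and j < 1000:
        total += weight * chi2_cdf(x, df + 2 * j)
        covered += weight
        j += 1
        weight *= mu / j
    return total

def cylinder_content(N, a, kappa=0.0):
    """Probability content of a hyperspherical cylinder of radius a in
    N-space whose axis lies at distance kappa from the center."""
    return noncentral_chi2_cdf(a * a, N - 1, kappa)
```

For $N = 3$ and a central axis this reduces to $1 - \exp(-a^2/2)$; for $N = 2$ the "cylinder" is a strip, so the result can be compared with a difference of two normal distribution functions.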


7. Probability content of a centrally situated ellipsoid. The problem treated 
in this section is equivalent to that of finding the distribution of the weighted 


sum of squares of mutually independent standardized normal variates. Formally,
we require

(7.1)  $F_{N;a_1,\dots,a_N}(t) = P\left(\sum_1^N a_i x_i^2 \le t\right), \quad a_i > 0,$

where the $x_i$ are the variates referred to. The center of the ellipsoid coincides
with the center of the distribution and the lengths of the semi-axes are
$(t/a_i)^{1/2}$ ($i = 1, 2, \dots, N$). The axes are oriented along the coordinate axes.

The distribution of $\sum a_i x_i^2$ has been discussed by Bhattacharya [20], Robbins
[21], Robbins and Pitman [22], Hotelling [23], Gurland [24], [25], Pachares [26]
and by Grad and Solomon [27]. The latter authors have tabulated $F_{N;a_1,\dots,a_N}(t)$
for $N = 2, 3$, and for various selected sets of $(a_1, a_2)$ and $(a_1, a_2, a_3)$. We shall
here obtain an integral recurrence relationship, based on the method of sections
used previously in this paper, which should enable a systematic extension to be
made of the available tables to values of $N > 3$, at least for moderate $N$,³ as well
as of the tables for $N = 2, 3$. The following additional remarks are pertinent:
(i) There is no loss of generality in assuming $\sum a_i = 1$, since this can always
be achieved by suitable standardization. However, the weights $a_i$ are all
non-negative.
(ii) The important statistical problem of the distribution of the weighted sum
of independent $\chi^2$ variates may be considered as a special case of our problem.
Specifically, if $y = \sum_{i=1}^k c_i u_i$ where the $u_i$ are independent $\chi^2$ variates with $n_i$
degrees of freedom, $\sum_1^k n_i = N$, then since $N$ independent standardized normal
variates, $x_1, x_2, \dots, x_N$, may be introduced so that

(7.2)  $u_i = \sum_{j=1}^{n_i} x_{n_1 + n_2 + \cdots + n_{i-1} + j}^2,$

$y$ may be expressed in terms of the $x_\alpha$ in the form

(7.3)  $y = \sum_{i=1}^{k} \sum_{j=1}^{n_i} c_i x_{n_1 + \cdots + n_{i-1} + j}^2 = \sum_{\alpha=1}^{N} a_\alpha x_\alpha^2, \quad a_\alpha = c_i \text{ for } \alpha = n_1 + n_2 + \cdots + n_{i-1} + j \quad (j = 1, 2, \dots, n_i).$

³ Extension of the Grad and Solomon tables for $N = 2$ and $N = 3$ has now been effected
by Professor H. Solomon and the present author with the aid of (7.10). It is hoped to publish
the extended tables shortly.
(iii) The problem of determining the distribution of a positive definite quadratic
function of N variables when the latter are distributed as in a non-degenerate 
multivariate normal distribution reduces easily to our problem. Geometrically, 
the probability content of a given ellipsoid is required when the surfaces of con- 
stant density of the normal distribution are those of homothetic ellipsoids. A 
rotation of the coordinate axes and subsequent scaling converts the latter ellip- 
soids into spheres, while the given ellipsoid will in general remain an ellipsoid 
under these two transformations. Finally, a rotation of the new coordinate axes 
to bring them into coincidence with the axes of the given ellipsoid is effected. 

Formally, one desires to evaluate the quantity

(7.4)  $P(x'Ax \le c^2) = (2\pi)^{-\frac{1}{2}N} |V|^{-\frac{1}{2}} \int_{x'Ax \le c^2} \exp(-\tfrac{1}{2} x'V^{-1}x)\, dx,$

in which $A$ and $V$ are each of rank $N$. Set $x = LRy$, where $V$ is decomposed by
triangular resolution (as in the introductory section) in the form $V = LL'$, $L$
being a lower $N \times N$ triangular matrix, while $R$ is the orthogonal matrix of the
characteristic vectors of the matrix $L'AL$. Then, after substitution,

(7.5)  $P(x'Ax \le c^2) = (2\pi)^{-\frac{1}{2}N} \int_{y'\Lambda y \le c^2} \exp(-\tfrac{1}{2} y'y)\, dy,$

where the diagonal matrix $\Lambda$ is given by $\Lambda = R'L'ALR$.
If the diagonal elements of $\Lambda$ are denoted by $\lambda_1, \lambda_2, \dots, \lambda_N$, the characteristic
numbers of $L'AL$ ($\lambda_i > 0$, $i = 1, 2, \dots, N$), equation (7.5) is equivalent to

$P(x'Ax \le c^2) = P\left(\sum_1^N \lambda_i y_i^2 \le c^2\right),$

in which the $y_i$ are mutually independent normal variates with zero means and
unit variances. This establishes the equivalence of the problem dealt with in this
subsection with that of equation (7.1).

(iv) A similar argument is applicable to the situations in which $A$ is positive
semi-definite. Here one wishes to evaluate the probability content of an
elliptic cylinder under spherical normal distributions. The latter is clearly equal
to the probability content of the ellipsoid, relating to an appropriately chosen
subspace, obtained by projection into the latter subspace. The dimensionality
of the subspace is equal to the rank of $A$. A case in point is the mean square
successive difference $\delta_{(1)}^2$, defined by

(7.6)  $2(N - 1)\delta_{(1)}^2 = x'Ax = \sum_{i=1}^{N-1} (x_{i+1} - x_i)^2$

([28], [29]), for which $A$ is the continuant

$A = \begin{pmatrix} 1 & -1 & & & \\ -1 & 2 & -1 & & \\ & -1 & 2 & -1 & \\ & & \ddots & \ddots & \ddots \\ & & & -1 & 1 \end{pmatrix},$

empty spaces denoting zeros. The mean square successive difference has been
proposed as a suitable estimator of variability when a secular trend in the mean
is suspected. The inequality $x'Ax \le c^2$ defines the interior and boundary of an
elliptic cylinder with axis $x_1 = x_2 = \cdots = x_N$ equally inclined to the coordinate
axes. The secular equation $|A - \lambda I| = 0$ has one zero and $N - 1$ positive roots

(7.7)  $\lambda_j = 4 \sin^2(j\pi/2N) \quad (j = 1, 2, \dots, N - 1),$

whence

$P[2(N - 1)\delta_{(1)}^2 \le c^2] = P\left(\sum_1^{N-1} \lambda_j y_j^2 \le c^2\right),$

in which the $y_j$ are mutually independent standardized normal variates. Note
that the $s$th cumulant of $2(N - 1)\delta_{(1)}^2$ is $\left(\sum_1^{N-1} \lambda_j^s\right) 2^{s-1}(s - 1)!$, by the additive
property of cumulants. The sums of powers of the characteristic numbers of $A$
required for the specification of the cumulants of $\delta_{(1)}^2$ may be expressed in terms
of the minors of $A$, using well-known results relating to symmetric functions of
the roots of a polynomial equation or, alternatively, by direct summation of the
finite trigonometric series $\sum_1^{N-1} \sin^{2s}(j\pi/2N)$, after expressing the powers of the
trigonometric ratios in terms of trigonometric ratios of multiples of the angles
by standard formulae.
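
As an illustration of (7.7) and of the cumulant formula, the short Python sketch below (function names are hypothetical, not from the paper) evaluates the roots and the cumulants of $\sum \lambda_j y_j^2$. The first two power sums can be checked against the trace of $A$, which is $2(N - 1)$, and the trace of $A^2$, which is $6N - 8$.

```python
import math

def msd_roots(N):
    """Non-zero characteristic roots (7.7) of the continuant A of order N."""
    return [4.0 * math.sin(j * math.pi / (2 * N)) ** 2 for j in range(1, N)]

def msd_cumulant(N, s):
    """s-th cumulant of sum_j lambda_j * y_j**2 for standard normal y_j,
    namely 2**(s-1) * (s-1)! * sum_j lambda_j**s."""
    power_sum = sum(lam ** s for lam in msd_roots(N))
    return 2 ** (s - 1) * math.factorial(s - 1) * power_sum
```

With $N = 6$, for instance, the first cumulant equals the trace $2(N - 1) = 10$ and the second equals $2(6N - 8) = 56$.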

Similar results apply to higher order successive differences, useful in eliminating
the inflationary effect of suspected given polynomial trends on estimates of
variability [30], [31]. The mean square $k$th order advancing difference $\delta_{(k)}^2$ is
defined as

(7.8)  $(N - k)\binom{2k}{k}\delta_{(k)}^2 = \sum_{i=1}^{N-k} (\Delta^k x_i)^2 = \sum_{i=1}^{N-k} \left[\sum_{j=0}^{k} (-1)^j \binom{k}{j} x_{i+k-j}\right]^2 \quad (k = 1, 2, \dots, N - 1).$

The matrix of the quadratic form involved in the definition of $\delta_{(k)}^2$ is a continuant
of order $k$ in the sense that all the elements other than those in the leading
diagonal and in the secondary upper and lower $k$ diagonals are zero. The matrix is
of rank $N - k$ and

$P\left((N - k)\binom{2k}{k}\delta_{(k)}^2 \le c^2\right) = P\left(\sum_{i=1}^{N-k} \lambda_i y_i^2 \le c^2\right),$

where the $\lambda_i$ are the $N - k$ non-zero characteristic numbers of the matrix and
the $y_i$ are mutually independent standardized normal variates. The distribution
of $\delta_{(2)}^2$ has been considered in some detail by Kamat [32].

(v) One additional application is worth noting. Consider a dynamic programming
or multifactorial design set-up in which the optimal course of action is
represented by the $N$-dimensional vector $x^*$. Suppose that $x^*$ is not known
exactly due either to a penumbra of vagueness surrounding the model from which
$x^*$ is deduced (i.e. faulty or imperfect theory), or to the fact that $x^*$ is predicated
on past experience (i.e. limited sampling), or for some other reason. Denote the
estimate of $x^*$ by $\hat{x}^*$, and let the expectation vector and variance-covariance
matrix of $\hat{x}^*$ be $x^*$ and $V(\hat{x}^*)$ ($V(\hat{x}^*)$ of full rank). A course of action $x$ will be
adopted aiming to approach as closely as possible the assumed ideal course of
action $\hat{x}^*$ (not $x^*$, which is unknown). Due to imperfect control of the action
variables exact coincidence is not possible. Assume that the expectation vector
and variance-covariance matrix of $x$ are $\hat{x}^*$ and $V(x)$, respectively ($V(x)$ of full
rank). Then $d = x - x^* = (x - \hat{x}^*) + (\hat{x}^* - x^*)$ has zero expectation vector
and variance-covariance matrix $V(\hat{x}^*) + V(x)$, provided the two kinds of errors
are uncorrelated. Let the loss function due to imperfect matching of $x$ with $x^*$
be the quadratic $d'Ad$, $|A| > 0$, and assume further that $x$, $\hat{x}^*$ and therefore $d$
have multivariate normal distributions. In view of the discussion in (iii), it is
clear that the probability of the loss not exceeding a given upper bound $c^2$ is
equal to $P\left(\sum_1^N \lambda_j y_j^2 \le c^2\right)$, in which the $y_j$ have the usual significance. (In
particular, the expected loss is $\sum_1^N \lambda_j$.) The reader is referred to Grad and Solomon
[27] who discuss an analogous ballistics problem for which $N = 3$.

(vi) There is one interesting case for which the distribution of the weighted
sum of squares may be expressed in exact form. If the number of components
$N$ is even, $N = 2m$, and the weights $c_j$ coincide in pairs, say $c_j = c_{N-j+1}$, then

$z = \sum_{j=1}^{N} c_j x_j^2 = \sum_{j=1}^{m} c_j w_j,$

where the $w_j$ are independent $\chi^2$, each with 2 degrees of freedom. The characteristic
function of $\sum c_j w_j$ is $\prod (1 - 2c_j it)^{-1}$. By partial fraction decomposition the
latter may be expressed in the form $\sum A_j/(1 - 2c_j it)$, which is obviously directly
invertible to $\sum (A_j/2c_j) \exp(-z/2c_j)$, the density function of $z$. It
follows that the complement of the distribution function of $z$, $P(z > x)$, is likewise
expressible as a linear combination of exponentials.

The above remarks may easily be extended to the situation where the weights
are repeated in groups of four (instead of groups of two), groups of six, etc., i.e.,
to the situation where $z = \sum c_j x_j^2$ may be identified as a weighted sum of independent
$\chi^2$ variates, each with the same even number of degrees of freedom.
More generally still, the degrees of freedom of the components, though still even,
need not be the same. A partial fraction representation of the characteristic
function enables the density function to be expressed as a linear combination of
Gamma density functions with degrees of freedom $2, 4, \dots, p$, where $p$ is the
highest degree of freedom of the several components. It follows that the distribution
function of the sum is a linear combination of Gamma distribution
functions with degrees of freedom $2, 4, \dots, p$.
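
The partial fraction argument of remark (vi) can be sketched in a few lines of Python. This is an illustration only, under the assumption that the paired weights $c_1, \dots, c_m$ are distinct and positive; the function name is hypothetical. The coefficients are $A_j = \prod_{k \ne j} c_j/(c_j - c_k)$, and the tail area is $P(z > x) = \sum_j A_j \exp(-x/2c_j)$.

```python
import math

def paired_weights_tail(c, x):
    """P(z > x) for z = sum_j c[j] * w_j, where the w_j are independent
    chi-square variates with 2 degrees of freedom and the c[j] are
    distinct positive weights (partial fraction coefficients A_j)."""
    tail = 0.0
    for j, cj in enumerate(c):
        A = 1.0
        for k, ck in enumerate(c):
            if k != j:
                A *= cj / (cj - ck)
        tail += A * math.exp(-x / (2.0 * cj))
    return tail
```

With a single weight the expression reduces to the familiar $\exp(-x/2c)$; with weights 1 and 2 it gives $2e^{-x/4} - e^{-x/2}$, whose integral over $(0, \infty)$ is the mean $2(1 + 2) = 6$.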

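We now obtain the recurrence relationship referred to at the beginning of this
section. Note first that the intersection of the flat $x_N = x$ with the $N$-dimensional
ellipsoid $\sum_1^N a_i x_i^2 \le t$ is itself an ellipsoid but of dimensionality $N - 1$ and with
semi-axes of lengths $((t - a_N x^2)/a_i)^{1/2}$, $i = 1, 2, \dots, N - 1$. The amount of
probability within the ellipsoid intercepted by two parallel and adjoining flats
$x_N = x$ and $x_N = x + dx$ is therefore

$(2\pi)^{-1/2} \exp(-\tfrac{1}{2}x^2)\, dx \cdot F_{N-1;b_1,\dots,b_{N-1}}\left[\frac{t - a_N x^2}{\sum_1^{N-1} a_i}\right],$

where $b_i = a_i/\sum_1^{N-1} a_j$, $0 < a_N < 1$ ($i = 1, 2, \dots, N - 1$). Consequently,
the probability content of the ellipsoid is

(7.9)  $F_{N;a_1,\dots,a_N}(t) = 2 \int_0^{(t/a_N)^{1/2}} (2\pi)^{-1/2} \exp(-\tfrac{1}{2}x^2)\, F_{N-1;b_1,\dots,b_{N-1}}\left(\frac{t - a_N x^2}{1 - a_N}\right) dx \quad (N = 2, 3, \dots),$

or, on setting $y = x(a_N/t)^{1/2}$,

(7.10)  $F_{N;a_1,\dots,a_N}(t) = 2\left(\frac{t}{a_N}\right)^{1/2} \int_0^1 (2\pi)^{-1/2} \exp\left(-\frac{t y^2}{2a_N}\right) F_{N-1;b_1,\dots,b_{N-1}}\left[\frac{t(1 - y^2)}{1 - a_N}\right] dy \quad (N = 2, 3, \dots).⁴$

We may note that for the particular case of $N - 1$ equal components,

(7.11)  $F_{N;\alpha,\dots,\alpha,\beta}(t) = 2\left(\frac{t}{\beta}\right)^{1/2} \int_0^1 (2\pi)^{-1/2} \exp\left(-\frac{t y^2}{2\beta}\right) F_{N-1}\left[\frac{t(1 - y^2)}{\alpha}\right] dy,$

in which $F_{N-1}(\cdot)$ denotes the distribution function of a $\chi^2$ with $N - 1$ degrees
of freedom, and $(N - 1)\alpha + \beta = 1$.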
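
The recurrence lends itself to direct numerical evaluation for small $N$. The Python sketch below is an illustration rather than the tabulation procedure used by the author: it applies the recurrence in an unnormalized form (the weights need not sum to one, so the argument of the inner function is simply $t - a_N x^2$), and it uses the substitution $x = (t/a_N)^{1/2} \sin\theta$ with a midpoint rule, which is a choice made here to keep the integrand smooth. The function name and the `panels` parameter are hypothetical.

```python
import math

def weighted_chi2_cdf(t, a, panels=200):
    """P(sum_i a[i] * x_i**2 <= t) for independent standard normal x_i and
    positive weights a[i], by repeated application of the recurrence."""
    if t <= 0.0:
        return 0.0
    if len(a) == 1:
        # One component: P(a_1 x^2 <= t) = erf(sqrt(t / (2 a_1)))
        return math.erf(math.sqrt(t / (2.0 * a[0])))
    a_last, rest = a[-1], a[:-1]
    r = math.sqrt(t / a_last)            # upper limit of x in the recurrence
    h = (math.pi / 2.0) / panels
    total = 0.0
    for i in range(panels):
        theta = (i + 0.5) * h            # midpoint of the i-th panel
        x = r * math.sin(theta)
        phi = math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)
        inner = weighted_chi2_cdf(t - a_last * x * x, rest, panels)
        total += phi * inner * r * math.cos(theta)
    return 2.0 * h * total
```

The cost grows as `panels ** (N - 1)`, so this form is practical only for $N = 2$ or $3$; with equal weights the result can be compared with the ordinary $\chi^2$ distribution function.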

Finally, it will be convenient to record here an interesting relationship between
the distribution of the weighted sum of squares of two independent standardized
normal variables and that of the non-central $\chi^2$ with two degrees of freedom.
The relationship⁵ in question is

⁴ (7.10) is, of course, just a convolution formula. It has been obtained here by a
geometrical argument for consistency.
⁵ I am indebted to one of the referees for having brought this useful result, for which an
unpublished geometrical proof has been obtained by Dr. David C. Kleinecke, to my
attention.

(7.12)  $F_{2;a_1,a_2}(t) = G_{2,\kappa}(\mu^2) - G_{2,\mu}(\kappa^2),$

where

$\kappa = \tfrac{1}{2}\left|(t/a_1)^{1/2} - (t/a_2)^{1/2}\right|, \quad \mu = \tfrac{1}{2}\left[(t/a_1)^{1/2} + (t/a_2)^{1/2}\right],$

and $G_{2,\kappa}(\cdot)$ is the distribution function of $\chi_{2,\kappa}^2$, the non-central $\chi^2$ with two degrees of
freedom and non-centrality parameter $\kappa$, as defined in Section 3.


8. Probability content of a regular simplex. Denote the probability content of
a regular $N$-dimensional simplex with sides of length $a$ and centroid at the center
of the spherical normal distribution with the same dimensionality by $K_N(a)$. To
derive a formula for $K_N(a)$ it will be convenient to divide the simplex into $N + 1$
(non-regular) simplices, obtained by joining the centroid to the $N + 1$ vertices
by straight lines.

Consider then one of these $N + 1$ derived simplices $S$. The probability content
of this simplex may be obtained by first evaluating the amount of probability in
a slab formed by two adjacent $(N - 1)$-flats parallel to the face opposite to
that vertex of $S$ which coincides with the centroid of the original simplex. Let $x$
and $x + dx$ denote the distances of these flats from the latter vertex. The
intersection of the first flat with $S$ is a regular simplex of dimensionality $N - 1$ and
with its edges of length $y$, where $y/a = x/d$, $d$ denoting the distance of the
centroid of the original simplex from one of its faces. Furthermore, the density
at a point on the same flat distant $\xi$ from the centroid of this regular simplex is

$(2\pi)^{-\frac{1}{2}N} \exp(-\tfrac{1}{2}r^2) = (2\pi)^{-\frac{1}{2}} \exp(-\tfrac{1}{2}x^2) \cdot (2\pi)^{-\frac{1}{2}(N-1)} \exp(-\tfrac{1}{2}\xi^2),$

where $r^2 = x^2 + \xi^2$ is the square of the distance of the centroid of the original
simplex from the point in question. It follows that the probability content of the
slab is $(2\pi)^{-\frac{1}{2}} \exp(-\tfrac{1}{2}x^2)\, dx\, K_{N-1}(ax/d)$, and the probability content of the
original simplex is

(8.1)  $K_N(a) = (N + 1) \int_0^d (2\pi)^{-\frac{1}{2}} \exp(-\tfrac{1}{2}x^2)\, K_{N-1}\left(\frac{ax}{d}\right) dx.$

It is easily shown that

(8.2)  $d = a/(2N(N + 1))^{1/2}.$

Substituting for $d$ in (8.1), the desired integral recurrence relationship is
obtained:

(8.3)  $K_N(a) = (N + 1) \int_0^{a/(2N(N+1))^{1/2}} (2\pi)^{-\frac{1}{2}} \exp(-\tfrac{1}{2}x^2)\, K_{N-1}[x(2N(N + 1))^{1/2}]\, dx \quad (N = 1, 2, \dots),$

or, equivalently [34]⁶,

(8.4)  $K_N(a) = \left(\frac{N + 1}{4\pi N}\right)^{1/2} \int_0^a \exp\left(-\frac{u^2}{4N(N + 1)}\right) K_{N-1}(u)\, du \quad (N = 1, 2, \dots),$

with $K_0(a) = 1$.

⁶ Godwin actually uses functions $G_m(\cdot)$ which are related to the $K$-functions by
$G_m(x/\sqrt{2}) = (2\pi)^{m/2}(m + 1)^{-1/2} K_m(x)$. The $K$-function, from a theoretical point of view,
seems to be more convenient and natural than the $G$-function, since it is (unlike the latter)
a distribution function, in the usual statistical sense.
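
Because (8.4) expresses $K_N$ as a running integral of $K_{N-1}$, the whole family can be tabulated on a single grid. The following Python sketch illustrates the idea under stated assumptions: the trapezoidal rule, the grid size, and the function name are choices made here, not taken from the paper.

```python
import math

def simplex_K_grid(N, a_max, steps=2000):
    """Tabulate K_N(a) of (8.4) at a = i * a_max / steps, i = 0..steps,
    starting from K_0 = 1 and integrating with the trapezoidal rule."""
    h = a_max / steps
    K = [1.0] * (steps + 1)                       # K_0(a) = 1
    for n in range(1, N + 1):
        scale = math.sqrt((n + 1) / (4.0 * math.pi * n))
        g = [math.exp(-(i * h) ** 2 / (4.0 * n * (n + 1))) * K[i]
             for i in range(steps + 1)]
        acc = [0.0] * (steps + 1)
        for i in range(1, steps + 1):             # running integral of g
            acc[i] = acc[i - 1] + 0.5 * h * (g[i - 1] + g[i])
        K = [scale * v for v in acc]
    return K
```

For $N = 1$ the simplex is a segment of length $a$ centred at the origin, so $K_1(a) = P(|x| \le a/2)$, which gives a convenient check; for large $a$ every $K_N(a)$ should approach unity.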

We shall give now one application involving a knowledge of $K_N(a)$. This
relates to the problem of determining the distribution of the sum (or mean) of
$N$ independent observations from a half-normal distribution, or equivalently the
sum (or mean) of the absolute values of observations from a normal population.
The first four moments of this distribution have been obtained by Kamat [33]
for $N = 3$, but the actual distribution for general $N$ does not appear to have
been obtained previously.

The density function for each observation is $(2/\pi)^{1/2} \exp(-x^2/2)$ ($0 \le x < \infty$),
and the joint density function of $N$ independent observations is

(8.5)  $\left(\frac{2}{\pi}\right)^{N/2} \exp\left(-\tfrac{1}{2} \sum_1^N x_i^2\right) \quad (x_i \ge 0,\ i = 1, 2, \dots, N).$

The determination of the density function of $u = \sum_1^N x_i$ thus reduces to the
determination of the probability intercepted by the $(N - 1)$-flats $\sum_1^N x_i = u$
and $\sum_1^N x_i = u + du$ in the positive orthant. To obtain this, note that
$\sum_1^N x_i = u$, $x_i \ge 0$ ($i = 1, 2, \dots, N$) defines a regular $(N - 1)$-dimensional
simplex with edges of length $u\sqrt{2}$. The distance of the flat $\sum_1^N x_i = u$ from the
origin (i.e., the distance of the latter point from the centroid of the simplex) is
$u/\sqrt{N}$. Further, the density at any point within the simplex distant $\eta$ from the
centroid may be expressed in the form

$\left(\frac{2}{\pi}\right)^{N/2} \exp\left[-\tfrac{1}{2}\left(\frac{u^2}{N} + \eta^2\right)\right] = \frac{2^N}{(2\pi)^{1/2}} \exp\left(-\frac{u^2}{2N}\right) \cdot (2\pi)^{-\frac{1}{2}(N-1)} \exp(-\tfrac{1}{2}\eta^2).$

Consequently, the probability content of the element $u < \sum_1^N x_i \le u + du$ is

(8.6)  $h_N(u)\, du = \frac{2^N}{(2\pi N)^{1/2}} \exp\left(-\frac{u^2}{2N}\right) du\, K_{N-1}(u\sqrt{2}),$

after integration with respect to $\eta$ over the simplex, where $h_N(u)$ denotes the
p.d.f. of $u$. Observe that equation (8.6) reveals at the same time the intimate
tie-up between the $K_{N-1}$-function and the $N$-fold convolution of the half-normal
distribution. This tie-up was first demonstrated by Godwin [35], using an entirely
different argument, in showing the equivalence of two expressions for the p.d.f.
of the mean absolute deviation in normal samples, obtained respectively in [36]
and [37].
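
A numerical sketch of (8.6) follows, again in Python and again only as an illustration: it tabulates $K_{N-1}$ by the trapezoidal form of (8.4) on the grid $a = u\sqrt{2}$ and multiplies by the normal factor. The function name and grid parameters are assumptions made for this example.

```python
import math

def half_normal_sum_density(N, u_max, steps=2000):
    """Tabulate the p.d.f. h_N(u) of (8.6) for the sum of N independent
    half-normal variates at u = i * u_max / steps, i = 0..steps."""
    du = u_max / steps
    da = du * math.sqrt(2.0)            # grid step in the argument a = u*sqrt(2)
    K = [1.0] * (steps + 1)             # K_0 = 1
    for n in range(1, N):               # build K_1, ..., K_{N-1} by (8.4)
        scale = math.sqrt((n + 1) / (4.0 * math.pi * n))
        g = [math.exp(-(i * da) ** 2 / (4.0 * n * (n + 1))) * K[i]
             for i in range(steps + 1)]
        acc = [0.0] * (steps + 1)
        for i in range(1, steps + 1):
            acc[i] = acc[i - 1] + 0.5 * da * (g[i - 1] + g[i])
        K = [scale * v for v in acc]
    const = 2.0 ** N / math.sqrt(2.0 * math.pi * N)
    us = [i * du for i in range(steps + 1)]
    hs = [const * math.exp(-u * u / (2.0 * N)) * K[i] for i, u in enumerate(us)]
    return us, hs
```

For $N = 2$ the result can be compared with the closed form $(2/\sqrt{\pi}) \exp(-u^2/4)\, \mathrm{erf}(u/2)$, and for any $N$ the tabulated density should integrate to one with mean $N(2/\pi)^{1/2}$.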
We conclude with two further important applications of the $K$-function. The
first relates (as already indicated) to the distribution of the mean absolute
deviation in samples of size $n$ from a normal population with zero mean and unit
variance. The p.d.f., $p_n(t)$, of the latter variable is given by


i a—l 32 
_ (n = m\ 17, — o net S 
paid) = (5) #9 (;) bin — ren | - gee] 


1. (5) =~(4) 


(Godwin [36]). Tables of $\int_0^t p_n(t)\, dt$ for $n \le 10$ are available in [38], while
percentage points of the distribution are given in [38] and [39] (Table 21, p. 165).

The second application relates to the distribution of the deviation of the
largest observation from the mean in a sample of $n$ independent observations
from a normal population with zero mean and unit variance. The distribution
function, $Q_n(\cdot)$, of the latter variable is given by

(8.8)  $Q_n(t) = K_{n-1}(nt\sqrt{2})$

(Nair [40]). Tables of $Q_n(t)$ are given in [40], while percentage points of the
distribution are available in [40] and [39] (Table 25, p. 172).

It should be noted that the $K$-functions also find applications in connection
with the distributions of a class of linear functions of normal order statistics
[40]⁷.


(8.7) 
GRAPHS FOR BIVARIATE NORMAL PROBABILITIES


By Marvin Zelen and Norman C. Severo¹
National Bureau of Standards


1. Introduction and summary. Recently there has been much activity dealing 
with the tabulation of the bivariate normal probability integral. D. B. Owen [3], 
[4] has summarized many of the properties of the bivariate normal distribution 
function and tabulated an auxiliary function which enables one to calculate the 
bivariate normal probability integral. In addition, the National Bureau of
Standards [1] has compiled extensive tables of the bivariate normal integral drawn
from the works of K. Pearson, Evelyn Fix and J. Neyman, and H. H. Germond.
In this same volume, D. B. Owen has contributed an extensive section on ap- 
plications. 

It is the purpose of this paper to present three charts, which will enable one to
easily compute the bivariate normal integral to a maximum error of $10^{-2}$. This
should be sufficient for most practical applications. Owen and Wiesen [5] have
also presented charts with a similar objective; however, as pointed out below, 
we believe the charts presented here lend themselves more easily to visual in- 
terpolation. Actually the motivation for these charts came from the Owen and 
Wiesen work. 


2. Notation and formulas. We present here notation and useful formulas
relating to the bivariate normal integral. Let $X$ and $Y$ be random variables
following a bivariate normal distribution with zero means, unit variances, and
correlation coefficient $\rho$. Then

(1)  $\Pr\{X \ge h, Y \ge k\} = L(h, k; \rho) = \int_h^\infty dx \int_k^\infty g(x, y; \rho)\, dy,$

where

$g(x, y; \rho) = [2\pi\sqrt{1 - \rho^2}]^{-1} \exp[-\tfrac{1}{2}(x^2 - 2\rho xy + y^2)/(1 - \rho^2)]$

is the bivariate normal probability density function.
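Useful relations for $L(h, k; \rho)$ are set out below:

(2)  $L(h, k; \rho) = L(k, h; \rho),$

(3)  $L(-h, -k; \rho) = \int_{-\infty}^{h} dx \int_{-\infty}^{k} g(x, y; \rho)\, dy,$

(4)  $L(-h, k; -\rho) = \int_{-\infty}^{h} dx \int_{k}^{\infty} g(x, y; \rho)\, dy,$

(5)  $L(h, -k; -\rho) = \int_{h}^{\infty} dx \int_{-\infty}^{k} g(x, y; \rho)\, dy,$

(6)  $2[L(h, k; \rho) + L(h, k; -\rho) + P(h) - Q(k)] - 1 = \int_{-h}^{h} dx \int_{-k}^{k} g(x, y; \rho)\, dy,$

(7)  $L(-h, k; \rho) + L(h, k; -\rho) = Q(k),$

(8)  $L(-h, -k; \rho) - L(h, k; \rho) = P(k) - Q(h),$

where

$P(x) = (2\pi)^{-1/2} \int_{-\infty}^{x} e^{-t^2/2}\, dt = 1 - Q(x).$

Special values of $L(h, k; \rho)$ are

(9)   $L(h, k; 0) = Q(h)Q(k),$
(10)  $L(h, k; -1) = 0 \quad (h + k \ge 0),$
(11)  $L(h, k; -1) = P(h) - Q(k) \quad (h + k \le 0),$
(12)  $L(h, k; 1) = Q(h) \quad (k \le h),$
(13)  $L(h, k; 1) = Q(k) \quad (k \ge h),$
(14)  $L(0, 0; \rho) = \tfrac{1}{4} + ((\arcsin \rho)/2\pi).$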


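3. Discussion of Charts. Owen [4] has shown that

(15)  $L(h, k; \rho) = L\left(h, 0; \frac{(\rho h - k)(\operatorname{sgn} h)}{(h^2 - 2\rho hk + k^2)^{1/2}}\right) + L\left(k, 0; \frac{(\rho k - h)(\operatorname{sgn} k)}{(h^2 - 2\rho hk + k^2)^{1/2}}\right) - \begin{cases} 0 & \text{if } hk > 0 \text{ or } hk = 0 \text{ and } h + k \ge 0, \\ \tfrac{1}{2} & \text{otherwise.} \end{cases}$

This makes it possible to evaluate $L(h, k; \rho)$ as a function of $L(h, 0; \rho)$ which
only depends on two parameters. Figures 1, 2, and 3 are plots of $h$ versus $\rho$ with
constant contour lines such that $L(h, 0; \rho) = 0.01(.01).10(.02).50$.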

Owen and Wiesen [5] have given charts plotting $L(h, 0; \rho)$ versus $h$ with
constant contours for $\rho$. The advantages of the charts presented here are that (i)
the contour lines more fully cover the available graph space making interpolation
easier and more accurate, and (ii) the Owen and Wiesen charts require visual
interpolation on the $\rho$ contour lines which could easily lead to errors larger than
$\pm .01$ in reading $L$. On the other hand, figures 1, 2, and 3 require only visual
interpolation on $L$ between successive contour lines differing by .01 or .02. Hence
interpolation errors are at most of the order $\pm .01$ throughout.
Fig. 1. $L(h, 0; \rho)$ for $0 \le h \le 1$ and $-1 \le \rho \le 0$. Values for $h < 0$ can be obtained using
$L(h, 0; -\rho) = \tfrac{1}{2} - L(-h, 0; \rho)$


4. Applications of the charts.
EXAMPLE 1. To find $L(0.5, 0.4; 0.8)$. Using (15), we have
$(h^2 - 2\rho hk + k^2)^{1/2} = 0.3$
$L(0.5, 0.4; 0.8) = L(0.5, 0; 0) + L(0.4, 0; -.6) = 0.15 + 0.08 = 0.23.$


The correct answer to 3D from [1] is $L(0.5, 0.4; 0.8) = 0.233$.
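
The reduction (15) can also be checked by direct numerical integration instead of chart reading. The Python sketch below is illustrative only and not part of the original paper: the function names are hypothetical, Simpson's rule and the truncation of the upper limit at $h + 10$ are choices made here, it assumes $|\rho| < 1$ for every call, and it takes sgn $0 = +1$ as a convention.

```python
import math

def Q(x):
    """Upper tail area of the standard normal distribution."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))

def L(h, k, rho, panels=4000, width=10.0):
    """Pr{X >= h, Y >= k} for |rho| < 1: Simpson's rule applied to the
    integral over x of phi(x) * Q((k - rho*x) / sqrt(1 - rho**2))."""
    s = math.sqrt(1.0 - rho * rho)

    def f(x):
        phi = math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)
        return phi * Q((k - rho * x) / s)

    step = width / panels
    total = f(h) + f(h + width)
    for i in range(1, panels):
        total += (4.0 if i % 2 else 2.0) * f(h + i * step)
    return total * step / 3.0

def L_via_15(h, k, rho):
    """Evaluate L(h, k; rho) through the two-parameter form (15)."""
    sgn = lambda v: 1.0 if v >= 0 else -1.0   # sgn 0 = +1 is an assumption
    d = math.sqrt(h * h - 2.0 * rho * h * k + k * k)
    r1 = (rho * h - k) * sgn(h) / d
    r2 = (rho * k - h) * sgn(k) / d
    same_side = h * k > 0 or (h * k == 0 and h + k >= 0)
    return L(h, 0.0, r1) + L(k, 0.0, r2) - (0.0 if same_side else 0.5)
```

Calling `L(0.5, 0.4, 0.8)` and `L_via_15(0.5, 0.4, 0.8)` should give the same value, agreeing with the 3D figure quoted in Example 1.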

EXAMPLE 2. Let $X$ and $Y$ follow a bivariate normal distribution with means
and variances $m_x = 3$, $m_y = 2$, $\sigma_x^2 = 16$, $\sigma_y^2 = 4$ and correlation $\rho = -0.125$. To
find the value of $\Pr\{X \ge 2, Y \ge 4\}$. Since

$\Pr\{X \ge h, Y \ge k\} = L[(h - m_x)/\sigma_x, (k - m_y)/\sigma_y; \rho],$




Fig. 2. $L(h, 0; \rho)$ for $0 \le h \le 1$ and $0 \le \rho \le 1$. Values for $h < 0$ can be obtained using
$L(h, 0; -\rho) = \tfrac{1}{2} - L(-h, 0; \rho)$


We have $\Pr\{X \ge 2, Y \ge 4\} = L(-0.25, 1.0; -0.125)$. Therefore using (15),
$L(-0.25, 1.00; -0.125) = L(-0.25, 0; 0.969) + L(1.0, 0; 0.125) - \tfrac{1}{2}$. The
charts only give values for $h > 0$; however using (7) with $k = 0$ we have
$L(-h, 0; \rho) = \tfrac{1}{2} - L(h, 0; -\rho)$.
Hence $L(-0.25, 0; .969) = \tfrac{1}{2} - L(0.25, 0; -.969)$ and thus
$L(-0.25, 1.0; -.125) = -L(0.25, 0; -.969) + L(1.0, 0; 0.125)$
$= -0.01 + 0.09 = 0.08.$


The correct answer to 3D from [1] is $L(-0.25, 1.0; -.125) = 0.080$.
EXAMPLE 3. To find the value of

$V(h, ah) = (2\pi)^{-1} \int_0^h dx \int_0^{ax} e^{-(x^2 + y^2)/2}\, dy$

when $a = 2$ and $h = 0.5$.
The function $V(h, ah)$ is sometimes known as Nicholson's function [2] and is
useful in finding integrals taken over polygons with respect to the circular normal
distribution. The relation between $V(h, ah)$ and $L(h, 0; \rho)$ can be shown to be

$V(h, ah) = \tfrac{1}{4} + L(h, 0; \rho) - L(0, 0; \rho) - \tfrac{1}{2}Q(h)$

where $\rho = -a/(1 + a^2)^{1/2}$. Hence for this example

$V(0.5, 1.0) = 0.25 + L(0.5, 0; -.894) - L(0, 0; -.894) - \tfrac{1}{2}Q(0.5)$
$= 0.25 + 0.01 - 0.07 - \tfrac{1}{2}(.29) = 0.05.$
The correct answer to 3D from [1] is $V(0.5, 1.0) = 0.047$.


5. Acknowledgments. We would like to acknowledge the help of David S.
Liepman and Miss M. Carroll Dannemiller who carried out the computations
for the charts.
CHARTS OF SOME UPPER PERCENTAGE POINTS OF THE
DISTRIBUTION OF THE LARGEST CHARACTERISTIC ROOT¹


By D. L. Heck²
University of North Carolina


1. Introduction. In multivariate analysis, the largest characteristic root of
certain matrices of sample quantities, or a simple function of this root, provides
a statistic for testing (i) independence between a set of $p$ correlated variates
and a set of $q$ correlated variates in a $(p + q)$-variate normal population and
(ii) the general multivariate linear hypothesis, assuming multivariate normal
populations. Likelihood ratio methods for dealing with these tests have also
been advanced by Wilks [29] and Bartlett [4], and comprehensive accounts of
the use of these techniques are given by Wilks [30], Rao [17], and Anderson [2].

Procedures based on the largest characteristic root were proposed by Roy
[19, 20, 23] for not only testing (i) and (ii), but also for obtaining confidence
bounds on parametric functions associated with both cases. These procedures
require the c.d.f. of the largest root, which was given in terms of a chain of
recursion formulae by Roy [19] and Nanda [12] who started from the joint sampling
distribution of the roots obtained earlier by Fisher [6], Girshick [10], Hsu
[11], and Roy [18]. This distribution of $\theta_i$ ($i = 1, 2, \dots, s$), the $s$ non-zero roots
obtained under the null hypothesis in (i) and (ii), is given by

$p(\theta_1, \theta_2, \dots, \theta_s) \prod_{i=1}^{s} d\theta_i = \pi^{s/2} \prod_{i=1}^{s} \frac{\Gamma\left(\frac{2m + 2n + s + i + 2}{2}\right)}{\Gamma\left(\frac{2m + i + 1}{2}\right) \Gamma\left(\frac{2n + i + 1}{2}\right) \Gamma\left(\frac{i}{2}\right)} \prod_{i=1}^{s} \theta_i^m (1 - \theta_i)^n \prod_{i > j} (\theta_i - \theta_j) \prod_{i=1}^{s} d\theta_i,$

$0 < \theta_1 \le \cdots \le \theta_s < 1,$

where $\theta_i$ and the parameters $s$, $m$, and $n$ assume different values, depending upon
the hypothesis being tested.
For example, in (i) the $\theta_i$'s are the $s$ non-zero roots of the $(p \times p)$ matrix
$S_{11}^{-1} S_{12} S_{22}^{-1} S_{12}'$, where

$\begin{pmatrix} S_{11} & S_{12} \\ S_{12}' & S_{22} \end{pmatrix}$, with row and column blocks of sizes $p$ and $q$,

is the sample covariance matrix based on $N - 1$ degrees of freedom. The parameters
are given by $s = \min(p, q)$, $m = (|p - q| - 1)/2$, and

$n = (N - p - q - 2)/2.$

In (ii) and the special case of testing for equality of the $(p \times 1)$ mean vectors
of $k$ groups in one-way classification multivariate analysis of variance, $\theta_i =
c_i/(1 + c_i)$, where the $c_i$'s are the $s$ non-zero roots of the $(p \times p)$ matrix
$BW^{-1}$. The elements of the $(p \times p)$ matrix $B$ consist of the "between groups"
corrected sums of squares and crossproducts of the $p$ variates, while $W$ is the
corresponding matrix for "within groups". With $n_i$ observations on the $p$ variates
in each of the $k$ groups, $s = \min(k - 1, p)$, $m = (|k - p - 1| - 1)/2$, and
$n = (\sum_{i=1}^{k} n_i - k - p - 1)/2$.
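
The parameter definitions above translate directly into code. The short Python sketch below merely restates them for the two cases; the function names are hypothetical and introduced only for this illustration.

```python
def independence_parameters(p, q, N):
    """(s, m, n) for case (i): independence between a set of p variates
    and a set of q variates, with a sample of size N."""
    s = min(p, q)
    m = (abs(p - q) - 1) / 2
    n = (N - p - q - 2) / 2
    return s, m, n

def manova_parameters(p, group_sizes):
    """(s, m, n) for the one-way MANOVA special case of (ii): p variates
    observed in k = len(group_sizes) groups of the given sizes."""
    k = len(group_sizes)
    s = min(k - 1, p)
    m = (abs(k - p - 1) - 1) / 2
    n = (sum(group_sizes) - k - p - 1) / 2
    return s, m, n
```

For example, $p = 3$, $q = 2$, $N = 20$ gives $s = 2$, $m = 0$, $n = 6.5$, and three groups of ten observations on $p = 2$ variates give $s = 2$, $m = -\tfrac{1}{2}$, $n = 12$.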

The tests are carried out by computing the largest root $\theta_s$ and then comparing this statistic
with the appropriate $100\alpha$ percentage point of the distribution of the largest root.
Further applications of these percentage points for testing purposes have been
given by Bargmann [3], Chaudhuri [5], Foster and Rees [7], Pillai [15], Roy [20],
[23], Roy and Bargmann [25], and Roy and Roy [27]. In addition, the use of the
points in setting up multivariate confidence bounds is discussed in [3], [21], [22],
[23], [24], [25], [26], [27].

The purpose of this paper is to present, in chart form, some upper percentage 
points of the distribution of the largest root for a wider range of the parameters 
than has heretofore been considered. One of the most extensive tabulations to 
date is that by Pillai [15], giving the upper 1% and 5% points and covering the 
range s = 2(1)5, m = 0(1)4, n = 5(5)40(20)100(30) 160, 200, 300, 500, 1000. 
Also, the upper 1% and 5% points for these same values of m and n have been 
obtained by Pillai and Bantegui [16] for $s = 6$. Other tables include Nanda's [13],
the upper 1% and 5% points for $s = 2$, $m = 0(\tfrac{1}{2})2$, $n = \tfrac{1}{2}(\tfrac{1}{2})10$; Chaudhuri's
[5], the upper 1% and 5% points for $s = 2$, $m = n = 2\tfrac{1}{2}(\tfrac{1}{2})5(1)11$, for $s = 3$,
$m = n = 2\tfrac{1}{2}(\tfrac{1}{2})5(1)8$, and for $s = 3$, $m = 0(\tfrac{1}{2})2$, $n = \tfrac{1}{2}(\tfrac{1}{2})2$; Foster and
Rees' [7], the upper 1%, 5%, 10%, 15%, and 20% points for $s = 2$, $m = -\tfrac{1}{2}$,
$0(1)9$, $n = 1(1)19(5)49, 59, 79$; and Foster's [8, 9], the upper 1%, 5%, 10%,
15%, and 20% points for $s = 3, 4$, $m = -\tfrac{1}{2}(\tfrac{1}{2})3$, $n = 0(1)95$. From Table
4.1 and the charts in Section 3, the upper 1%, 2.5%, and 5% points may be
obtained for $s = 2(1)5$, $m = -\tfrac{1}{2}, 0(1)10$, $n \ge 5$.


2. Computation of the percentage points. The charts in Section 3 were
prepared from percentage points which were computed using two types of
approximations to the c.d.f. of the largest characteristic root. The first type of
approximation, obtained for $s = 2, 3, 4, 5$ by Pillai [14], was used to compute, in
general, the points for integral $n \le 100$. For large values of $n$, generally $n > 100$,
asymptotic approximations based on Pillai's formulae, obtained by
Whittlesey [28], were used.

To compute the percentage points from Pillai's approximations, denoted by
$p_s(x, m, n)$, the value of $p_s(x, m, n)$ for a particular combination $(s, m, n)$ was
first calculated at the 100 values of $x$ from .01 to 1.0 at intervals of .01. On the
resulting ordinates, a method of inverse interpolation was used to obtain the
upper 1%, 2.5%, and 5% points, i.e. $x_\alpha$ such that

$$p_s(x_\alpha, m, n) = 1 - \alpha \qquad (\alpha = .01, .025, .05).$$
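
The tabulate-then-invert step just described may be sketched as follows. This is an illustration under stated assumptions, not the program used for the charts: the paper does not say which inverse-interpolation rule was used, so linear interpolation is assumed here, and the function `upper_point` and the stand-in c.d.f. $x^3$ are hypothetical.

```python
def upper_point(cdf, alpha, step=0.01):
    """Return x with cdf(x) approximately 1 - alpha, by linear inverse
    interpolation on the ordinates at x = step, 2*step, ..., 1.0."""
    target = 1.0 - alpha
    n_pts = round(1.0 / step)
    xs = [i * step for i in range(1, n_pts + 1)]
    ys = [cdf(x) for x in xs]
    for i in range(len(xs) - 1):
        y0, y1 = ys[i], ys[i + 1]
        if y0 <= target <= y1:
            if y1 == y0:
                return xs[i]
            # Interpolate x as a function of the ordinate.
            return xs[i] + (target - y0) * (xs[i + 1] - xs[i]) / (y1 - y0)
    raise ValueError("target probability is not bracketed by the grid")


# Stand-in c.d.f. (assumption): x**3 on (0, 1), not Pillai's approximation.
for alpha in (0.01, 0.025, 0.05):
    print(alpha, upper_point(lambda x: x ** 3, alpha))
```

With Pillai's $p_s(x, m, n)$ supplied in place of the stand-in, the same routine would be called once for each combination $(s, m, n)$ and each $\alpha$.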


The overall computational procedure for each value of s was as follows: For a 
fixed integral m and an initial (small) n, the percentage points were computed; 
n was then stepped up by unit increments, with the percentage points being com- 
puted for each value of n, until the desired set of values of n was covered. Then 
the expression was modified for the next integral value of m, and the percentage 
points for this value of m were computed for all desired n. This procedure was 
continued to m = 10, which is a fairly large value for practical purposes. 

As a check on the accuracy of these percentage points, a number of the points 
were substituted in the expression for the exact c.d.f., and the largest error which 
occurred was found to be less than two units in the fourth decimal. 

Whittlesey's asymptotic approximations (for integral values of $m$) may be
obtained from Pillai's approximations by using Stirling's approximation and the
substitution

(2.1)    $z = -(m + 2n + s + 1) \log (1 - x),$

and then letting $n$ become large. From the resulting expressions, denoted by
$w_s(z, m)$, inverse interpolation was used to obtain $z_\alpha(s, m)$ (or $z_\alpha$) such that for
fixed $s$ and $m$,

$$w_s(z_\alpha, m) = 1 - \alpha \qquad (\alpha = .01, .025, .05).$$

From these "asymptotic" $z_\alpha(s, m)$ values, given in Table 4.1, the percentage
points $x_\alpha(s, m, n)$ were obtained by inverting (2.1).
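
Inverting the substitution (2.1) gives $x = 1 - \exp\{-z/(m + 2n + s + 1)\}$. A minimal sketch of the conversion in both directions follows; the function names are hypothetical, and the value of $z$ used in the example is an arbitrary illustration, not an entry of Table 4.1.

```python
from math import exp, log


def z_from_x(x, s, m, n):
    """Substitution (2.1): z = -(m + 2n + s + 1) log(1 - x)."""
    return -(m + 2 * n + s + 1) * log(1.0 - x)


def x_from_z(z, s, m, n):
    """Inverse of (2.1): x = 1 - exp(-z / (m + 2n + s + 1))."""
    return 1.0 - exp(-z / (m + 2 * n + s + 1))


# Arbitrary illustrative value of z (assumption), with s = 2, m = 0, n = 150.
print(x_from_z(10.0, 2, 0, 150))
```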

A group of the percentage points obtained from Whittlesey's approximations
was checked by substitution in the expression for the exact c.d.f., and of those
points used in the final tabulation, the error for the most unfavorable
combination of $s$ and $m$ ($s = 5$, $m = 10$) was found to be five units in the fourth decimal.
This error, which is primarily an error of asymptotic approximation, is
considerably smaller for smaller values of $s$ and $m$, and, because of the asymptotic nature
of the approximation, decreases in all cases for increasing $n$.

Computation of the percentage points and the $z_\alpha(s, m)$ values was carried out
on the IBM 650, with the programs coded in The Bell Interpretive System [31].
The program of the exact c.d.f. of the largest root, which was used for checking
purposes, was coded in DOPSIR [1] (for $s = 2(1)6$, $m = 0(1)10$, and integral
$n \ge 0$), and is available at the North Carolina State College IBM Laboratory,
The Institute of Statistics, Raleigh, North Carolina. The computation of the
points for $m = -\tfrac{1}{2}$ was done subsequent to the computation for integral valued
$m$, and Pillai's and Whittlesey's approximations were again used, after
appropriate modifications were made.

3. Charts of the upper 1%, 2.5%, and 5% points of the distribution of the
largest characteristic root.


3.1. Description. Charts I-XII enable finding $x_\alpha(s, m, n)$ such that

$$P[\theta_s \le x_\alpha(s, m, n)] = 1 - \alpha,$$

where $\theta_s$ is the largest non-zero root. On each page, the graphs appear for a
particular $s$ and $\alpha$ ($s = 2(1)5$, $\alpha = .01, .025, .05$) for $m = -\tfrac{1}{2}, 0(1)10$ and $n$ from
5 to 1000. The curves corresponding to the twelve values of $m$ on each page are
in two sections, the lower section being the continuation of the upper section,
with an overlap occurring from $x_\alpha = .50$ to $.55$. Of the two scales for $x_\alpha$ at the
bottom of the page, the upper scale corresponds to the upper set of curves and
the lower scale to the lower set. The lowest curve in each case (with the
exception of Chart III) corresponds to $m = -\tfrac{1}{2}$, the next lowest to $m = 0$, the next
to $m = 1$, etc., to the uppermost curve, which corresponds to $m = 10$. The scale
for $n$ is on the left margin of the page and is logarithmic.


TABLE 4.1
Values of $z_\alpha(s, m)$

                    s = 2                          s = 3
   m       .01      .025      .05         .01      .025      .05

 -1/2   12.1601   10.1465    8.5041     17.1762   14.9006   13.1141
   0    14.5680   12.4157   10.7393     19.5012   17.1192   15.2389
   1    18.7346   16.3599   14.4873     23.6906   21.1262   19.0866
   2    22.4664   19.9086   17.8762     27.5181   24.7971   22.6216
   3    25.9526   23.2352   21.0641     31.1203   28.2507   25.9635
   4    29.2755   26.4145   24.1192     34.5647   31.5768   29.1708
   5    32.4795   29.4870   27.0779     37.8905   34.7848
   6    35.5920   32.4773   29.9628     41.1230   37.9071
   7    38.6311   35.4018   32.7886     44.2795   40.9597
   8    41.6098   38.2722   35.5658     47.3726   43.9542
   9    44.5375   41.0970               60.4118   46.8993
  10    47.4215   43.8827                         49.8017


s = 4

23.5278
27.1543   31.5731
30.5096   35.0938
33.9135   38.4880
37.1265   41.7829
40.2588   47.7639   44.9971
43.3246   50.9876   48.1441
46.3345   54.1508   51.2340
49.2064   57.2615   54.2745
52.2166   60.3264   57.2717

3.2. Note. The values of $x_\alpha(s, m, n)$ may be read from the charts correct to two
decimals. For a more precise value, when $n > 100$, the method described in
Section 4 is suggested.


4. Asymptotic $z_\alpha(s, m)$ values.


4.1. Description. In Table 4.1 the values of $z_\alpha(s, m)$ are listed for $s = 2(1)5$,
$m = -\tfrac{1}{2}, 0(1)10$, and $\alpha = .01, .025, .05$. For $n > 100$, these may be used to
obtain $x_\alpha(s, m, n)$, with an error of at most five units in the fourth decimal. For a
given combination $(s, m, n)$ and a desired significance level $\alpha$, determine
$x = x_\alpha(s, m, n)$ from (2.1) with $z = z_\alpha(s, m)$ obtained from Table 4.1.


5. Acknowledgments. I should like to express my sincere thanks to S. N. Roy
and R. E. Bargmann for their helpful advice and assistance in the preparation
of this paper.

CORRECTION FOR BIAS INTRODUCED BY A TRANSFORMATION OF
VARIABLES


By Jerzy Neyman and Elizabeth L. Scott
University of California, Berkeley


1. Introduction. The problem of “normalizing” transformations has two dif- 
ferent facets: one is concerned with the identity of transforming functions suit- 
able for variables following a distribution with a particular shape or properties 
and the other with the nature of the statistics capable of serving as unbiased 
estimates in cases where a given transforming function appears to be successful. 
The literature on the first problem is rich (see, for example, [1], [2] and [8]).
This paper is concerned with the second problem. Our purpose is to deduce mini- 
mum variance unbiased estimates of the effects of experimental treatments 
expressed in the original units. The solution is obtained for a broad category of 
transforming functions. 

The estimates of treatment effects expressed in the original units are cus- 
tomarily obtained by the inverse transformation of the estimates in transformed 
units. As is well known [1], [6], [7], this traditional estimate is biased. Occasionally, 
this bias is important. Further, the bias gains importance when a number of 
similar estimates of the same effect, obtained from independent sets of observa- 
tions, are averaged in order to estimate the average effect. The random errors 
of the particular effects tend to average out but, in general, not the bias. 


2. Statement of the problem. Our basic assumption in this paper is that the
transformation used in the analysis of an experiment is faultless so that the
transformed variables exactly follow normal distributions with some postulated
means and with the same unknown variance $\sigma^2$. Generically, these normal
variables will be denoted by the letter $\xi(\psi)$, where $\psi$ identifies the expectation of the
variable concerned. Thus $\xi(\psi)$ is the transformed variable in the experiment. The
variable that is directly observable will be denoted by $X(\psi)$. It will be assumed
that


(1)    $X(\psi) = f[\xi(\psi)],$


where f is a strictly increasing function defined for all real values of its argument. 
Later on, we shall introduce further limitations on f. It will be noticed that f 
is the inverse of the function used for transforming the observable variable X. 
For example, with the square root transformation the function f is the square 
of its argument. 

The problem treated is concerned with a particular pair of variables of the
family considered, namely with $\xi(\mu)$ and $X(\mu)$, where $\mu$ is the mean of $\xi(\mu)$ and
is a well-defined but unknown number. Specifically, we are concerned with
estimating


(2)    $\theta = E[X(\mu)].$


Our problem arises when the variables $\xi(\mu)$ and $X(\mu)$ are not directly observable.
On the other hand, the variables that are observable in the given experiment
yield a pair of statistics, $\hat\mu$ and $S^2$, mutually independent and jointly sufficient
for $\mu$ and $\sigma^2$. The first is a normal variable with mean $\mu$ and variance $\lambda^2\sigma^2$, where
$\lambda^2$ is a known number. The second statistic, $S^2$, is the residual sum of squares
and, divided by $\sigma^2$, is distributed as $\chi^2$ with a certain number $\nu$ of degrees of
freedom. Our problem is to devise a function, say $\hat\theta(\hat\mu, S^2)$, such that its
expectation equals $\theta$ identically in $\mu$ and $\sigma^2$. Because of the familiar result of Lehmann
and Scheffé [5] that the sufficient system of statistics $(\hat\mu, S^2)$ is boundedly
complete, it follows that the function $\hat\theta(\hat\mu, S^2)$ is unique and is the minimum variance
unbiased estimate of $\theta$.

Before proceeding to the construction of the estimate $\hat\theta(\hat\mu, S^2)$ we give two
illustrative examples.


3. Example 1: An experiment in randomized blocks. Denote by $\alpha$ and $\beta$ two
unknown parameters capable of assuming values within a certain open set, and
by $\xi(\alpha, \beta)$ a normal random variable with expectation


(3)    $E[\xi(\alpha, \beta)] = \alpha + \beta$


and with a fixed variance $\sigma^2$. Correspondingly,


(4)    $X(\alpha, \beta) = f[\xi(\alpha, \beta)].$


A randomized block experiment will yield particular values of $mn$ independent
random variables $X(\alpha_i, \beta_j)$ for $i = 1, 2, \cdots, m$ and $j = 1, 2, \cdots, n$, with
$\sum \beta_j = 0$. Here the $\beta$'s represent the familiar block effects and $\alpha_i$ stands for the
"transformed effect" of the $i$th treatment in the hypothetical average conditions
of the experiment. The analysis of the experiment ordinarily involves the
estimation in original units (pounds, inches, number of surviving insects, etc.) of the
effect of the $i$th treatment if it were applied in the average conditions of the
experiment. In order that this estimate can be conveniently combined, by
averaging, with similar estimates derived from other experiments involving the same
treatment, the estimate sought should be unbiased. The quantity to estimate²
is, then,


(5)    $\theta = E[X(\alpha_i, 0)] = E\{f[\xi(\alpha_i, 0)]\},$


where $\xi(\alpha_i, 0)$ is a normal variable with an unknown mean $\alpha_i = \mu$ and with
variance $\sigma^2$. While $\xi(\alpha_i, 0)$ and $X(\alpha_i, 0)$ are not directly observable, the
experiment yields an estimate of $\mu$, namely,


(6)    $\hat\mu = \frac{1}{n} \sum_{j=1}^{n} \xi_{ij},$


which, according to traditional theory, is normal with mean $\mu = \alpha_i$ and variance

(7)    $\sigma_{\hat\mu}^2 = \lambda^2 \sigma^2 = \sigma^2/n.$


² Of course, the definition of the "effect of the $i$th treatment in the average conditions
of the experiment" by means of Formula (5) is not the only possible definition of this
concept. An alternative definition might be the average over $j$ of the quantities $E\{f[\xi(\alpha_i, \beta_j)]\}$.


Also, in the usual notation,


(8)    $S^2 = \sum_{i=1}^{m} \sum_{j=1}^{n} (\xi_{ij} - \xi_{i\cdot} - \xi_{\cdot j} + \xi_{\cdot\cdot})^2$


is the sum of squares of the residuals which, combined with $\hat\mu$, forms a sufficient
system of statistics for $\mu$ and $\sigma^2$. It is independent of $\hat\mu$, and is distributed as the
product of $\sigma^2$ by a $\chi^2$ with $\nu = (m - 1)(n - 1)$ degrees of freedom. Our problem
is to devise a function, $\hat\theta(\hat\mu, S^2)$, which is an unbiased estimate of $\theta$.
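
For concreteness, the statistics of this example may be computed from an $m \times n$ array of transformed observations $\xi_{ij}$ (treatments by blocks) as in the following sketch. The function name and the small data array are hypothetical; the quantities returned are $\hat\mu$ for treatment $i$, the residual sum of squares $S^2$, $\lambda^2 = 1/n$ and $\nu = (m - 1)(n - 1)$.

```python
def block_statistics(xi, i):
    """Return (mu_hat, S2, lam2, nu) for treatment i of a randomized block
    layout; xi[a][j] is the transformed observation for treatment a, block j."""
    m, n = len(xi), len(xi[0])
    row_means = [sum(row) / n for row in xi]
    col_means = [sum(xi[a][j] for a in range(m)) / m for j in range(n)]
    grand = sum(row_means) / m
    mu_hat = row_means[i]
    # Residual sum of squares of the two-way layout.
    S2 = sum(
        (xi[a][j] - row_means[a] - col_means[j] + grand) ** 2
        for a in range(m)
        for j in range(n)
    )
    lam2 = 1.0 / n
    nu = (m - 1) * (n - 1)
    return mu_hat, S2, lam2, nu


# Hypothetical transformed data: two treatments in two blocks.
print(block_statistics([[1.0, 2.0], [3.0, 5.0]], 1))
```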


4. Example 2: Regression analysis of a randomized cloud seeding experiment.
An experiment is performed to check whether the "seeding" of clouds, intended
to increase the precipitation in a "target area" $T$, has an effect. Also, it is
intended to estimate the amount of this effect measured in inches of actual
precipitation. A certain number $s \ge 1$ of adjoining areas, presumed to be unaffected
by seeding, are used as controls. We shall use the symbol $X_i$ to denote the
rainfall from a particular storm falling in the $i$th control area and the vector symbol
$X = (X_1, X_2, \cdots, X_s)$ to denote the precipitation from the same storm in
all the controls. For each $X$ we consider the random variable $Y(X)$ representing
the target precipitation in conditions when the precipitation in the controls is $X$
and there is no seeding. All these variables are measured in inches.

Now suppose that a storm, with control precipitation equal to $X'$, is seeded
and yields $Y'$ inches of rain in the target. In order to estimate the effect of this
seeding it is necessary to have an estimate of the rain which would have fallen
in the target from the same storm if there were no seeding. In other words, we
need an estimate of $E[Y(X')] = \theta$. The hypotheses usually made about the
variables $X$ and $Y(X)$ are that, by means of some suitable change of scale,
$X_i = f(\xi_i)$, etc., they can be replaced by transformed variables $\xi$ and $\eta(\xi)$,
respectively, such that, for each $\xi$, the variable $\eta(\xi)$ is (approximately) normally
distributed with a mean


(9)    $E[\eta(\xi)] = \alpha_0 + \sum_{i=1}^{s} \alpha_i \xi_i,$


where the $\alpha$'s are unknown constants, and with a variance $\sigma^2$ independent of $\xi$.
With these assumptions, the quantity $\theta$ to be estimated is


(10)    $\theta = E\{f[\eta(\xi')]\},$

where $\xi'$ stands for the transformed value of $X'$. Denote by $\mu$ the expectation
of $\eta(\xi')$,


(11)    $\mu = \alpha_0 + \sum_{i=1}^{s} \alpha_i \xi_i',$


and by $\hat\mu$ its minimum variance unbiased estimate obtained from the regression
analysis of a random sample of unseeded storms. The same analysis provides
the sum $S^2$ of squares of residuals, which is stochastically independent of $\hat\mu$ and,
when divided by $\sigma^2$, is distributed as $\chi^2$ with a certain number of degrees of
freedom $\nu$. The variance of $\hat\mu$ is $\lambda^2(\xi')\sigma^2$, where the factor $\lambda^2(\xi')$ depends upon
the value of $\xi'$ and, in fact, grows without limit when $\xi'$ diverges from the average
$\bar\xi$ of the control precipitation from the nonseeded storms used to evaluate $\hat\mu$.
Our problem consists in determining a function $\hat\theta(\hat\mu, S^2)$ such that


(12)    $E[\hat\theta(\hat\mu, S^2)] = \theta.$


The difference between $Y'$ and $\hat\theta(\hat\mu, S^2)$ is the estimated effect of seeding,
expressed in inches.


5. Method and auxiliary formulas. Since the normalizing transformations are
supposed to amount to a change in scale of measuring the observable random
variables, it is natural to assume that the function $f$ determining the observable
random variable $X$ in terms of the normal variable $\xi$ is fairly regular. Our method,
to be termed the expansion method, of constructing $\hat\theta(\hat\mu, S^2)$ is limited to the case
where (i) $\theta = E\{f[\xi(\mu)]\}$ exists, (ii) $f$ is an entire function, and (iii) the
expectation $\theta$ may be obtained by taking expectations, term by term, of the Taylor
expansion of $f$, so that


(13)    $\theta = E\{f[\xi(\mu)]\} = f(0) + \sum_{n=1}^{\infty} \frac{1}{n!} f^{(n)} E[\xi^n(\mu)],$


where $f^{(n)}$ stands for the $n$th derivative of $f$ evaluated at zero. Then, for each $n$,
we determine a homogeneous combination

(14)    $T_n = \sum_{k=0}^{n} A_{n,k}\, \hat\mu^k S^{n-k}$

such that

(15)    $E(T_n) = E[\xi^n(\mu)],$

and show that


(16)    $\hat\theta(\hat\mu, S^2) = f(0) + \sum_{n=1}^{\infty} \frac{1}{n!} f^{(n)} T_n$


is the solution of the problem.


Also, for functions $f$ of a particular family, we give an alternative easy method
of constructing $\hat\theta(\hat\mu, S^2)$. Before proceeding we must recall certain formulas and
deduce certain bounds.

For every $m = 1, 2, \cdots$, we have

(17)    $E[(S^2)^m] = (2\sigma^2)^m\, \Gamma(\tfrac{1}{2}\nu + m)/\Gamma(\tfrac{1}{2}\nu).$

Also

(18)    $E[\xi^{2m}(\mu)] = (2m)! \sum_{k=0}^{m} \frac{\mu^{2k}}{(2k)!\,(m - k)!} (\sigma^2/2)^{m-k}$

and it follows that

(19)    $\frac{(2m)!}{m!} (\sigma^2/2)^m \le E[\xi^{2m}(\mu)] \le \frac{(2m)!}{m!} [(\mu^2 + \sigma^2)/2]^m.$

Similarly

(20)    $E[\xi^{2m+1}(\mu)] = \mu\, (2m + 1)! \sum_{k=0}^{m} \frac{\mu^{2k}}{(2k + 1)!\,(m - k)!} (\sigma^2/2)^{m-k}.$

In order to obtain convenient bounds on $E(|\xi^{2m+1}|)$, we notice first that

(21)    $|\mu| \frac{(2m + 1)!}{m!} (\sigma^2/2)^m \le |E[\xi^{2m+1}(\mu)]| \le E|\xi^{2m+1}(\mu)|.$

Further, by Schwarz' inequality and because of (19),

(22)    $E|\xi^{2m+1}(\mu)| \le \{E[\xi^2(\mu)]\, E[\xi^{4m}(\mu)]\}^{1/2} \le (\mu^2 + \sigma^2)^{1/2} \left(\frac{(4m)!}{(2m)!}\right)^{1/2} [(\mu^2 + \sigma^2)/2]^m.$

However, it is easy to see that

(23)    $\frac{m!}{2^m (2m + 1)!} \left(\frac{(4m)!}{(2m)!}\right)^{1/2} = \left\{\prod_{k=1}^{m} \frac{(4k - 3)(4k - 1)}{(4k + 2)^2}\right\}^{1/2} < 1.$

Consequently, we may write

(24)    $|\mu| \frac{(2m + 1)!}{m!} (\sigma^2/2)^m \le E|\xi^{2m+1}(\mu)| \le \frac{(2m + 1)!}{m!} (\mu^2 + \sigma^2)^{m + 1/2},$

for all $m$.
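
The moment formulas of this section are easily checked numerically. The sketch below is an illustration only, with hypothetical function names: it evaluates the even and odd moments of a normal variable with mean $\mu$ and variance $\sigma^2$ from the sums for $E[\xi^{2m}(\mu)]$ and $E[\xi^{2m+1}(\mu)]$, and compares them with the familiar low-order moments and with the bounds (19).

```python
from math import factorial


def even_moment(mu, sigma2, m):
    """E[xi^(2m)] for xi normal with mean mu and variance sigma2."""
    return factorial(2 * m) * sum(
        mu ** (2 * k) / (factorial(2 * k) * factorial(m - k))
        * (sigma2 / 2) ** (m - k)
        for k in range(m + 1)
    )


def odd_moment(mu, sigma2, m):
    """E[xi^(2m+1)] for xi normal with mean mu and variance sigma2."""
    return mu * factorial(2 * m + 1) * sum(
        mu ** (2 * k) / (factorial(2 * k + 1) * factorial(m - k))
        * (sigma2 / 2) ** (m - k)
        for k in range(m + 1)
    )


mu, sigma2 = 1.5, 2.0
# Fourth moment: mu^4 + 6 mu^2 sigma^2 + 3 sigma^4.
print(even_moment(mu, sigma2, 2), mu**4 + 6 * mu**2 * sigma2 + 3 * sigma2**2)
# Third moment: mu^3 + 3 mu sigma^2.
print(odd_moment(mu, sigma2, 1), mu**3 + 3 * mu * sigma2)
```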


6. Term by term evaluation of the expectation of a Taylor series. In this section
we use the bounds found in Section 5 in order to prove certain theorems.

THEOREM 1. In order that the series in the right-hand side of (13) be convergent
irrespective of the values of $\mu$ and $\sigma^2$, it is necessary and sufficient that the radii of
convergence of the two series


(25)    $\sum_{n=0}^{\infty} \frac{1}{n!} f^{(2n)} x^n \quad \text{and} \quad \sum_{n=0}^{\infty} \frac{1}{n!} f^{(2n+1)} x^n$

both be infinite, so that


(26)    $\lim_{n\to\infty} \left|\frac{f^{(2n)}}{n!}\right|^{1/n} = \lim_{n\to\infty} \left|\frac{f^{(2n+1)}}{n!}\right|^{1/n} = \lim_{n\to\infty} \frac{1}{n} |f^{(2n)}|^{1/n} = \lim_{n\to\infty} \frac{1}{n} |f^{(2n+1)}|^{1/n} = 0.$


It will be noticed that conditions (26) are stronger than the assumption that
the Taylor expansion of $f$,


(27)    $f(\xi) = \sum_{n=0}^{\infty} \frac{1}{n!} f^{(n)} \xi^n,$


is convergent for all real $\xi$. In fact, the conditions necessary and sufficient for
this to happen may be written as


(28)    $\lim_{n\to\infty} \frac{1}{n^2} |f^{(2n)}|^{1/n} = \lim_{n\to\infty} \frac{1}{n^2} |f^{(2n+1)}|^{1/n} = 0.$

Thus, if the radii of convergence of (25) are infinite, then that of (27) will also
be infinite, but the converse is not necessarily true.


In order to prove the theorem we simply notice that inequalities (19) and (24)
imply that, for all $m$,

(29)    $\frac{1}{m!} |f^{(2m)}| (\sigma^2/2)^m \le \frac{1}{(2m)!} |f^{(2m)}|\, E[\xi^{2m}(\mu)] \le \frac{1}{m!} |f^{(2m)}| [(\mu^2 + \sigma^2)/2]^m,$

(30)    $\frac{1}{m!} |\mu f^{(2m+1)}| (\sigma^2/2)^m \le \frac{1}{(2m + 1)!} |f^{(2m+1)} E[\xi^{2m+1}(\mu)]| \le \frac{1}{m!} |f^{(2m+1)}| (\mu^2 + \sigma^2)^{m + 1/2}.$
If we assume that the series (13) is convergent for all values of $\mu$ and $\sigma^2$, then
this will imply that the middle terms in (29) and (30) tend to zero as $m \to \infty$.
In turn, this implies that, for all $\sigma^2$ and for sufficiently large $m$,


(31)    $\frac{1}{m!} |f^{(2m)}| (\sigma^2/2)^m < 1, \qquad \frac{1}{m!} |f^{(2m+1)}| (\sigma^2/2)^m < 1,$


which is equivalent to (26). On the other hand, if we assume that conditions
(26) are satisfied, then the two series (25) are absolutely convergent for all
values of the argument and the right inequalities (29) and (30) imply absolute
convergence of (13).


THEOREM 2. Under the conditions of Theorem 1, that is, under conditions (26),


(32)    $\theta = E\{f[\xi(\mu)]\} = f(0) + \sum_{n=1}^{\infty} \frac{1}{n!} f^{(n)} E[\xi^n(\mu)]$

for all $\mu$ and $\sigma$.

In other words, if conditions (26) are satisfied then the expectation of $f$ can
be obtained by taking expectations term by term of the Taylor expansion of
this function.

Theorem 2 is implied by inequality (24) showing that in the middle part of
Formula (30) the expectation of $\xi^{2m+1}(\mu)$ may be replaced by the expectation of
the absolute value $|\xi^{2m+1}(\mu)|$.

For convenience of reference we shall adopt the following definition. 

DEFINITION. An entire function $f$ is called of second order if it satisfies
conditions (26).

It will be seen that every indefinitely differentiable function whose derivatives 
at a particular point are bounded is necessarily a second order entire function. 
Also, the sum of two second order entire functions is itself a second order entire 
function. 


7. Lack of complete generality of the expansion method. At this point it may
be interesting to indicate a purely mathematical formulation of the general
problem treated in this paper. This is as follows.

For a given positive integer $\nu$, for a given positive number $\lambda^2$ and for a given function
$f$, defined on the real line and such that


(33)    $\int_{-\infty}^{+\infty} |f(x)|\, e^{-(x - \mu)^2/2\sigma^2}\, dx < +\infty$


for all real $\mu$ and for $\sigma > 0$, to determine a function of two arguments $\hat\theta(z, y^2)$,
independent of $\mu$ and $\sigma$, such that


(34)    $2^{\nu/2 - 1} \lambda \sigma^{\nu}\, \Gamma(\tfrac{1}{2}\nu) \int_{-\infty}^{+\infty} f(x)\, e^{-(x - \mu)^2/2\sigma^2}\, dx = \int_{-\infty}^{+\infty} e^{-(z - \mu)^2/2\lambda^2\sigma^2} \int_{0}^{+\infty} y^{\nu - 1} e^{-y^2/2\sigma^2}\, \hat\theta(z, y^2)\, dy\, dz,$


identically in $\mu$ and $\sigma > 0$.

The expansion method provides the solution of this problem when the function
$f$ is entire of second order. However, it is easy to construct entire functions $f$
satisfying (33) which are not of the second order. One example is $\exp\{-x^2\}$.
To such functions the expansion method is not applicable and we are not certain
whether the solution of equation (34) exists.


8. Minimum variance unbiased estimate of the expectation of a second order
entire function. From now on we shall deal exclusively with functions $f(\xi)$ which
are entire of the second order.

Let $\hat\mu$ be a normal variable with expectation $\mu$ and variance $\lambda^2\sigma^2$, where $\lambda^2$
is a known number. Also, let $S^2$ be independent of $\hat\mu$ and such that $S^2/\sigma^2$ is
distributed as $\chi^2$ with $\nu$ degrees of freedom. Finally, for $n = 0, 1, 2, \cdots$, let


(35)    $T_{2n} = \sum_{k=0}^{n} \frac{(2n)!}{(2k)!\,(n - k)!}\, \hat\mu^{2k}\, [\tfrac{1}{4} S^2 (1 - \lambda^2)]^{n-k}\, \frac{\Gamma(\tfrac{1}{2}\nu)}{\Gamma(\tfrac{1}{2}\nu + n - k)}$

and


(36)    $T_{2n+1} = \sum_{k=0}^{n} \frac{(2n + 1)!}{(2k + 1)!\,(n - k)!}\, \hat\mu^{2k+1}\, [\tfrac{1}{4} S^2 (1 - \lambda^2)]^{n-k}\, \frac{\Gamma(\tfrac{1}{2}\nu)}{\Gamma(\tfrac{1}{2}\nu + n - k)}.$


By direct computation it is easy to verify that, for every $m = 1, 2, \cdots$,

(37)    $E(T_m) = E[\xi^m(\mu)].$
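
As a sketch only, the statistics $T_n$ of this section and the resulting series estimate may be coded as follows. The function names are hypothetical; `derivs_at_zero` is assumed to hold $f(0), f'(0), f''(0), \cdots$ for a polynomial $f$ (for a general second order entire function the series would have to be truncated, and the truncation rule is left open here).

```python
from math import exp, factorial, lgamma


def T(order, mu_hat, S2, lam2, nu):
    """Statistic T_order: formula for T_{2n} when order = 2n,
    formula for T_{2n+1} when order = 2n + 1."""
    n, odd = divmod(order, 2)
    c = 0.25 * S2 * (1.0 - lam2)
    total = 0.0
    for k in range(n + 1):
        j = n - k
        coef = factorial(order) / (factorial(2 * k + odd) * factorial(j))
        # Gamma(nu/2) / Gamma(nu/2 + j), computed on the log scale.
        gamma_ratio = exp(lgamma(0.5 * nu) - lgamma(0.5 * nu + j))
        total += coef * mu_hat ** (2 * k + odd) * c ** j * gamma_ratio
    return total


def expansion_estimate(derivs_at_zero, mu_hat, S2, lam2, nu):
    """f(0) + sum_n f^(n)(0) T_n / n!, for a polynomial f."""
    return sum(
        d / factorial(n) * T(n, mu_hat, S2, lam2, nu)
        for n, d in enumerate(derivs_at_zero)
    )


# f(xi) = xi**2 + 3, so f(0) = 3, f'(0) = 0, f''(0) = 2 (illustrative values).
print(expansion_estimate([3.0, 0.0, 2.0], mu_hat=1.5, S2=4.0, lam2=0.25, nu=8))
```

For this quadratic $f$ the series stops after the term in $T_2 = \hat\mu^2 + (1 - \lambda^2) S^2/\nu$.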


THEOREM 3. If $f$ is a second order entire function, then


(38)    $\hat\theta(\hat\mu, S^2) = f(0) + \sum_{n=1}^{\infty} \frac{1}{n!} f^{(n)} T_n$


is convergent for all values of $\hat\mu$ and $S^2$ and is an unbiased estimate of
$\theta = E\{f[\xi(\mu)]\}$.

Comparing (35) and (36) with (18) and (20), noticing that


(39)    $\Gamma(\tfrac{1}{2}\nu)/\Gamma(\tfrac{1}{2}\nu + n - k) \le (2/\nu)^{n-k}$

and referring to (19), we find that


(40)    $|T_{2m}| \le \frac{(2m)!}{m!} Y^m$

and, in a similar manner,


(41)    $|T_{2m+1}| \le |\hat\mu| \frac{(2m + 1)!}{m!} Y^m,$


with

(42)    $Y = (\nu \hat\mu^2 + S^2 (1 + \lambda^2))/2\nu.$


Because $f$ is a second order entire function, it follows that the series (38) is
absolutely convergent for all values of $\hat\mu$ and $S^2$. In order to prove that the
expectation of $\hat\theta(\hat\mu, S^2)$ as defined by (38) can be obtained by taking expectations
term by term, it is sufficient to show the convergence of the series obtained
from (38) by replacing each $T_n$ by the expectation of its absolute value. This
is easily accomplished by noticing that $|T_n|$ cannot exceed the expression
obtained by replacing in formulas (35) and (36) the value of $\hat\mu$ by that of $|\hat\mu|$ and
$1 - \lambda^2$, which may be negative, by $1 + \lambda^2$. Further computations, similar to
those leading to (19) and (24), indicate then that


(43)    $\sum_{n=1}^{\infty} \frac{1}{n!} |f^{(n)}|\, E|T_n| < +\infty.$

Because of (37), it follows that $\hat\theta(\hat\mu, S^2)$ as defined by (38) has the desired
property of being an unbiased estimate of $\theta$.

Formula (38) has the advantage of, so to speak, exhausting the method;
it provides the minimum variance unbiased estimate of $\theta$ whatever the second
order entire function $f$ may be. This generality is paid for by the complexity of
the solution provided by (38). In the next section we give a somewhat simpler
formula for $\hat\theta(\hat\mu, S^2)$ which is applicable when the function $f$ satisfies a certain
differential equation.


9. Alternative solution applicable to recursive type second order entire
functions. We shall say that the function $f$ is of recursive type if it satisfies the second
order differential equation


(44)    $f''(x) = A + B f(x),$


where $A$ and $B$ are arbitrary constants. However, in order to eliminate the
trivial case where $f$ is linear, we shall assume that at least one of the constants
differs from zero. It is easy to verify that every recursive function is necessarily
a second order entire function. This section will be limited to consideration
of recursive type functions $f$. We shall be particularly interested in the
expectation of their Taylor expansion about the point $\mu$. Because the odd central
moments of the normal variable are all equal to zero, we shall be concerned
only with the derivatives of $f$ of even order. We have, for all $n$,


(45)    $f^{(2n)}(x) = A B^{n-1} + B^n f(x)$

and, if $B \ne 0$,


(46)    $\theta = E\{f[\xi(\mu)]\} = f(\mu) + \sum_{n=1}^{\infty} \frac{1}{n!} (A B^{n-1} + B^n f(\mu)) (\sigma^2/2)^n = f(\mu)\, e^{B\sigma^2/2} + \frac{A}{B} (e^{B\sigma^2/2} - 1).$


Alternatively, if $B = 0$, that is, if $f$ is quadratic,

(47)    $\theta = f(\mu) + A\sigma^2/2.$

Similarly, for $B \ne 0$,

(48)    $E[f(\hat\mu)] = f(\mu)\, e^{B\lambda^2\sigma^2/2} + (A/B)(e^{B\lambda^2\sigma^2/2} - 1)$

and, for $B = 0$,

(49)    $E[f(\hat\mu)] = f(\mu) + A\lambda^2\sigma^2/2.$

Eliminating $f(\mu)$ from (46) and (48) and from (47) and (49), we find

(50)    $\theta = e^{B(1 - \lambda^2)\sigma^2/2}\, E[f(\hat\mu)] + (A/B)(e^{B(1 - \lambda^2)\sigma^2/2} - 1)$  for $B \ne 0$

and

(51)    $\theta = E[f(\hat\mu)] + A(1 - \lambda^2)\sigma^2/2$  for $B = 0$.


The last formula indicates that, when $B = 0$, the minimum variance unbiased
estimate of $\theta$ is given by


(52)    $\hat\theta(\hat\mu, S^2) = f(\hat\mu) + A(1 - \lambda^2) S^2/2\nu.$

If $B \ne 0$ then, in order to obtain $\hat\theta(\hat\mu, S^2)$, it is sufficient to determine a function,
say $\Phi(\alpha S^2, \nu)$, independent of $\sigma$, such that its expectation equals $\exp(\alpha\sigma^2/2)$.
Taking into account the expansion


(53)    $e^{\alpha\sigma^2/2} = \sum_{n=0}^{\infty} \frac{1}{n!} (\alpha\sigma^2/2)^n,$


we easily find


(54)    $\Phi(\alpha S^2, \nu) = \sum_{n=0}^{\infty} \frac{1}{n!} \frac{\Gamma(\tfrac{1}{2}\nu)}{\Gamma(\tfrac{1}{2}\nu + n)} (\alpha S^2/4)^n = \left(\frac{2}{S\sqrt{\alpha}}\right)^{\nu/2 - 1} \Gamma(\tfrac{1}{2}\nu)\, I_{\nu/2 - 1}(S\sqrt{\alpha}),$


where $I_p(x)$ is the Bessel function of imaginary argument.
It follows from (50) that


(55)    $\hat\theta(\hat\mu, S^2) = \Phi[B(1 - \lambda^2) S^2, \nu]\,[f(\hat\mu) + (A/B)] - (A/B),$

which is the general formula for the minimum variance unbiased estimate of $\theta$
corresponding to the case where $f$ is a recursive function. It will be seen that,
generally, $\hat\theta(\hat\mu, S^2)$ is a linear function of the traditional estimate $f(\hat\mu)$ of $\theta$, with
coefficients depending upon $S^2$, $\lambda^2$ and $\nu$. If $\lambda^2 = 1$, that is, if the variance of $\hat\mu$
coincides with that of $\xi(\mu)$, then $\hat\theta(\hat\mu, S^2) = f(\hat\mu)$. Otherwise $f(\hat\mu)$ is biased. In
the particular case $B = 0$, the correction for bias is additive, as indicated in
(52). This makes the square root transformation very convenient (provided,
of course, it provides effective normalization!) in dealing with balanced
experiments in which the quantities to be estimated are differences of certain averages,
the estimates of which all have the same variances.

If $A = 0$ but $B \ne 0$, then the correction for bias in $f(\hat\mu)$ is multiplicative.
Finally, if both $A$ and $B$ differ from zero, we have a combination of a
multiplicative and an additive correction.
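
The estimate for recursive type functions lends itself to direct computation. The following sketch is an illustration under stated assumptions, not a definitive implementation: the function names are hypothetical, $\Phi$ is summed from its power series in $\alpha S^2/4$ with an arbitrary stopping tolerance, and the caller is assumed to supply $f$ together with the constants $A$ and $B$ of its differential equation.

```python
from math import exp


def phi(alpha_S2, nu, tol=1e-15, max_terms=500):
    """Sum over n of Gamma(nu/2) / (n! Gamma(nu/2 + n)) * (alpha S^2 / 4)^n."""
    x = alpha_S2 / 4.0
    term, total = 1.0, 1.0
    for n in range(1, max_terms):
        # Ratio of successive terms: x / (n (nu/2 + n - 1)).
        term *= x / (n * (0.5 * nu + n - 1))
        total += term
        if abs(term) < tol * abs(total):
            break
    return total


def theta_hat_recursive(f, A, B, mu_hat, S2, lam2, nu):
    """Unbiased estimate for f with f'' = A + B f (additive if B = 0)."""
    if B == 0:
        return f(mu_hat) + A * (1.0 - lam2) * S2 / (2.0 * nu)
    return phi(B * (1.0 - lam2) * S2, nu) * (f(mu_hat) + A / B) - A / B


# Square of the argument plus a constant: A = 2, B = 0 (illustrative values).
print(theta_hat_recursive(lambda t: t * t + 3.0, 2.0, 0.0, 1.5, 4.0, 0.25, 8))
# For very large nu, phi approaches exp(alpha * S2 / (2 * nu)).
print(phi(1.0e6, 1.0e6), exp(0.5))
```

Summing the series directly avoids any dependence on a Bessel function routine; a library routine for the modified Bessel function could be substituted where one is available.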

The importance of bias in the traditional estimate of $\theta$ may be evaluated by
solving for $E[f(\hat\mu)]$ equations (50) and (51). We have


(56)    $E[f(\hat\mu)] = (\theta + (A/B))\, e^{-B(1 - \lambda^2)\sigma^2/2} - (A/B),$  for $B \ne 0$


and

(57)    $E[f(\hat\mu)] = \theta - A(1 - \lambda^2)\sigma^2/2,$  for $B = 0$.


10. Some particular cases. In this section we use the general results of Section
9 to deduce particular formulas. We obtain the minimum variance unbiased
estimate of $\theta$ and the expectation of $f(\hat\mu)$ referring to four particular normalizing
transformations: (i) the square root transformation, (ii) the logarithmic
transformation, (iii) the angular transformation and (iv) the hyperbolic sine
transformation.

(i) In the case of the square root transformation, the transformed variable is


(58)    $\xi = (X - a)^{1/2},$

where $a$ is a known constant. We ignore the ambiguity connected with the fact
that, for $\xi$ to be a normal variable it must be capable of assuming negative values.
The function $f$ is


(59)    $X = f(\xi) = \xi^2 + a.$


Obviously this is a recursive function with $A = 2$, $B = 0$. Consequently, formula
(52) yields directly


(60)    $\hat\theta(\hat\mu, S^2) = f(\hat\mu) + (1 - \lambda^2) S^2/\nu = \hat\mu^2 + a + (1 - \lambda^2) S^2/\nu.$


The bias of the traditional estimate $f(\hat\mu) = \hat\mu^2 + a$ is obtained from (57),
namely,


(61)    $E[f(\hat\mu)] = \theta - (1 - \lambda^2)\sigma^2.$


Hence, unless $\lambda^2 \ge 1$, so that the variance of $\hat\mu$ is at least equal to that of $\xi(\mu)$,
the use of $f(\hat\mu)$ as an estimate systematically underestimates $\theta$. Furthermore,
the better the estimate $\hat\mu$, that is, the smaller the value of $\lambda^2$, the greater the bias.
(ii) In the case of logarithmic transformation, we have


(62)    $\xi = \log_{10} X$

and hence

(63)    $X = f(\xi) = 10^{\xi} = e^{m\xi}$, say.


Here again the function $f$ is of recursive type with $A = 0$ and $B = m^2$. Formula
(55) gives


(64)    $\hat\theta(\hat\mu, S^2) = \Phi[m^2(1 - \lambda^2) S^2, \nu]\, f(\hat\mu) = \Phi[m^2(1 - \lambda^2) S^2, \nu]\, 10^{\hat\mu}.$

Substituting $A = 0$ and $B = m^2$ into (56) we obtain

(65)    $E[f(\hat\mu)] = E[10^{\hat\mu}] = \theta\, e^{-m^2(1 - \lambda^2)\sigma^2/2}.$


Thus, with the logarithmic transformation, the bias of the traditional
estimate is multiplicative. If the variance of $\hat\mu$ is less than $\sigma^2$ then the use of $10^{\hat\mu}$ will
systematically underestimate $\theta$ and vice versa. The bias grows with increasing
$|1 - \lambda^2|$.
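
To give an idea of the magnitude involved, the multiplicative bias factor $E[10^{\hat\mu}]/\theta = \exp\{-m^2(1 - \lambda^2)\sigma^2/2\}$, with $m = \log_e 10$, may be evaluated as in the sketch below. The function name and the numerical values of $\sigma^2$ and $\lambda^2$ are hypothetical and serve only as an illustration.

```python
from math import exp, log


def log10_bias_factor(sigma2, lam2):
    """Ratio E[10**mu_hat] / theta for the logarithmic transformation."""
    m = log(10.0)
    return exp(-m * m * (1.0 - lam2) * sigma2 / 2.0)


# Illustrative values (assumption): sigma^2 = 0.04 in log10 units, lambda^2 = 0.1.
print(log10_bias_factor(0.04, 0.1))
```

With these illustrative values the traditional estimate falls short of $\theta$ by roughly nine per cent on the average.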

(iii) With angular transformation we have

(66)    $\xi = \arcsin \sqrt{X}$

and

(67)    $X = f(\xi) = \sin^2 \xi = \tfrac{1}{2}(1 - \cos 2\xi).$


Here again the function $f$ is of recursive type with $A = 2$ and $B = -4$. Hence,
Formula (55) gives


(68)    $\hat\theta(\hat\mu, S^2) = \Phi[4(\lambda^2 - 1) S^2, \nu]\,(\sin^2 \hat\mu - \tfrac{1}{2}) + \tfrac{1}{2}.$

Substituting $A = 2$ and $B = -4$ in (56) we have

(69)    $E[f(\hat\mu)] = E[\sin^2 \hat\mu] = (\theta - \tfrac{1}{2})\, e^{2(1 - \lambda^2)\sigma^2} + \tfrac{1}{2}.$

It is seen that, if the variance of $\hat\mu$ is less than $\sigma^2$, the traditional estimate
$\sin^2 \hat\mu$ is systematically "too far" from $\tfrac{1}{2}$. If the true value of $\theta < \tfrac{1}{2}$, then $\sin^2 \hat\mu$
will tend to underestimate $\theta$. Otherwise, if $\theta > \tfrac{1}{2}$, there will be a tendency to
overestimate $\theta$. With $\lambda > 1$ these two tendencies will be reversed.

(iv) The last transformation to be considered here is based on the function

(70)    $X = f(\xi) = \sinh^2 \xi = \tfrac{1}{2}[\cosh 2\xi - 1].$

It is of recursive type with $A = 2$ and $B = 4$. Hence, from Formula (55)

(71)    $\hat\theta(\hat\mu, S^2) = \Phi[4(1 - \lambda^2) S^2, \nu]\,(\sinh^2 \hat\mu + \tfrac{1}{2}) - \tfrac{1}{2}.$

Formula (56) with the indicated values of $A$ and $B$ gives

(72)    $E[f(\hat\mu)] = E[\sinh^2 \hat\mu] = (\theta + \tfrac{1}{2})\, e^{-2(1 - \lambda^2)\sigma^2} - \tfrac{1}{2}.$


In this case $f(\hat\mu)$ underestimates or overestimates $\theta$ according to whether $\lambda$
is less or greater than unity.


11. Concluding remarks. (i) Formula (54) defining $\Phi$ may seem complicated.
In fact, the series on the right converges fairly rapidly so that sufficient accuracy
is obtained with only a few terms.

It is easy to check that Formula (54) may be rewritten as follows


(73)    $\Phi(\alpha S^2, \nu) = 1 + \frac{\alpha\hat\sigma^2}{2} + \sum_{n=2}^{\infty} \frac{(\alpha\hat\sigma^2/2)^n}{n!} \Big/ \prod_{k=1}^{n-1} \left(1 + \frac{2k}{\nu}\right),$


where

(74)    $\hat\sigma^2 = S^2/\nu$


is the unbiased estimate of $\sigma^2$. It will be seen that, for $n > 1$, the absolute value
of each term on the right is less than the corresponding term in the expansion of
$\exp\{\alpha\hat\sigma^2/2\}$. In other words, the series (73) converges faster than the series for
the exponential function.

(ii) In some circumstances, the practical statistician may decide to work on
the assumption that the variance $\sigma^2$ is known. In order to adjust the formulas
deduced in this paper to this case, it is sufficient to replace $\hat\sigma^2$ by $\sigma^2$ and pass to
the limit as $\nu \to \infty$. In particular, this procedure reduces the right hand side of
(73) to $\exp\{\alpha\sigma^2/2\}$, which is (53).

(iii) Formulas have been published for correcting the bias introduced by the 
transformation of variables in some particular cases in [1], [6] and [7], for ex- 
ample. However, these formulas do not agree with ours. 

(iv) It is a pleasure to express our indebtedness to the referee who picked up
several mistakes in the original text of the paper and, in addition, called our
attention to the important publications [3], [4] and [9], which we overlooked.
Among the problems treated in these papers, there is one which is strongly
related to ours. In the present notation, this problem consists in finding the
function $h(\hat\xi, S^2, \nu)$ that is an unbiased estimate of a given function $g(\xi, \sigma^2)$. This
problem is treated under the restriction that $\lambda^2 = 1/(\nu + 1)$. Apart from this
restriction, in order to reduce our problem to the problem just described, it is
sufficient to evaluate the expectation E{f[t(«)|} and to denote the result by 
$g(\xi, \sigma^2)$. The difference between our results and those in [3], [4] and [9] consists
first in a difference in the method and in the conditions of the various theorems:
in the earlier papers the conditions are expressed in terms of the function $g$
whereas, in the present paper, they refer to the function $f$. Also, explicit formulas
for the unbiased estimates of $\theta$, as given here, are not contained in the papers
quoted.
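
Remarks (i) and (ii) above can be checked numerically. The following Python sketch assumes that the series (73) has the form $\sum_k (aS^2/2)^k / [k!\,\nu(\nu+2)\cdots(\nu+2k-2)]$, which is what the moments of $S^2$ give when $S^2/\sigma^2$ has a $\chi^2$ distribution with $\nu$ degrees of freedom. The function names are illustrative and do not come from the paper.

```python
import math

def series_terms(a, s2, nu, n_terms=40):
    """Terms (a*s2/2)**k / (k! * nu*(nu+2)*...*(nu+2k-2)) for k = 0..n_terms-1."""
    terms = []
    term = 1.0
    for k in range(n_terms):
        terms.append(term)
        # Ratio of term k+1 to term k.
        term *= (a * s2 / 2.0) / ((k + 1) * (nu + 2 * k))
    return terms

def series_sum(a, s2, nu, n_terms=40):
    """Partial sum of the series (assumed form of (73))."""
    return sum(series_terms(a, s2, nu, n_terms))

def exp_terms(a, s2, nu, n_terms=40):
    """Terms of exp(a * sigma_hat^2 / 2), where sigma_hat^2 = s2 / nu."""
    x = a * s2 / (2.0 * nu)
    return [x ** k / math.factorial(k) for k in range(n_terms)]
```

Comparing `series_terms` with `exp_terms` term by term illustrates remark (i); letting `nu` grow with `s2 / nu` held fixed illustrates the limit described in remark (ii).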
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ON A PROBLEM OF J. NEYMAN AND E. SCOTT! 


By L. SCHMETTERER
University of Hamburg and University of California, Berkeley 


1. Introduction. Let $\xi$ be a one-dimensional random variable distributed
according to $N(a, \sigma^2)$ (that means a normal distribution with mean value $a$ and
variance $\sigma^2$) and $f(x)$ a measurable function defined on $-\infty < x < \infty$. Suppose
that

$((2\pi)^{1/2}\sigma)^{-1} \int_{-\infty}^{+\infty} f(x)\,e^{-(x-a)^2/2\sigma^2}\,dx = w(a, \sigma^2)$

exists for all real numbers $a$ and all $\sigma^2 > 0$ as a Lebesgue integral. Let $\lambda$ be a
fixed positive number and $\eta$ a one-dimensional random variable with distribution
$N(a, \sigma^2\lambda^2)$. Let $\zeta$ be a random variable such that $\zeta/\sigma^2$ has a Pearson-Helmert
distribution with $\nu$ degrees of freedom. Further, we suppose that $\eta$ and $\zeta$ are
independent. The question is whether or not there are unbiased estimates
$H(\eta, \zeta)$ for $w(a, \sigma^2)$ where $-\infty < a < \infty$, and where $\sigma^2 > 0$. In a paper of
Neyman and Scott [1] it is proved that there is always an unbiased estimate
$H(\eta, \zeta)$ for $w(a, \sigma^2)$ for the class of entire functions $f(z)$ which satisfy the
conditions


(1) ~ (| 42" (0) 1” os o(1), ~ (| **?(0) PP a o(1) 


and which take real values on the real line. Condition (1) can be expressed in
the following way: $f(z)$ is an entire function of order 2 and type zero or of any
smaller order. Let us recall that an entire function $f(z)$ is of order $k$ and type
$\alpha \ge 0$ if $|f(re^{i\varphi})| = O(\exp\{(\alpha + \varepsilon)r^k\})$ for every $\varepsilon > 0$ but for no $\varepsilon < 0$ if
$r \ge r(\varepsilon)$ and $0 \le \varphi < 2\pi$.

In addition, the following problem is raised in the paper just mentioned: Let
$f(x)$ be a measurable function defined on the whole real line such that

(i) $\int_{-\infty}^{+\infty} |f(x)|\,e^{-\varepsilon x^2}\,dx$ converges for all $\varepsilon > 0$.

Is there always an unbiased estimate $H(\eta, \zeta)$ for $w(a, \sigma^2)$ if $f(x)$ satisfies
Condition (i)? In this paper it will be shown that there is for each $f(x)$ satisfying
Condition (i) an unbiased estimate $H(\eta, \zeta)$ for $w(a, \sigma^2)$ if $\lambda$ is any positive
number in the interval $0 < \lambda \le 1$. However, if $\lambda$ is allowed to be $> 1$, then
the unbiased estimate $H(\eta, \zeta)$ need not exist. Also, it will be shown that the
theorem of Neyman and Scott mentioned above is, in a certain sense, best.


Received October 2, 1959; revised February 6, 1960.

2. Problem and solution. Let us now consider for any $\lambda > 0$ and any real
number $\nu \ge 1$ the integral equation

(2) $\int_{-\infty}^{+\infty} f(x)\,e^{-(x-a)^2/2\sigma^2}\,dx = (2\lambda\sigma^2\Gamma(\nu/2))^{-1} \int_0^{\infty}\int_{-\infty}^{+\infty} H(y, z)\,e^{-(y-a)^2/2\sigma^2\lambda^2}\,e^{-z/2\sigma^2}\,(z/2\sigma^2)^{\nu/2-1}\,dy\,dz,$

where $-\infty < a < \infty$, $\sigma^2 > 0$ and $f(x)$ is any measurable function satisfying
(i). We will prove the following theorem.

THEOREM 1. If $f(x)$ is a measurable function defined on the whole real line and
satisfying (i), then for all positive numbers $\lambda$ in the interval $0 < \lambda \le 1$ there is a
solution $H(y, z)$ of (2) which satisfies the following "natural" condition

(ii) $\int_0^{\infty}\int_{-\infty}^{+\infty} |H(y, z)|\,z^{\nu/2-1}\,e^{-\varepsilon y^2 - \eta z}\,dy\,dz$ converges for all $\varepsilon > 0$, $\eta > 0$.

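Moreover, there is only one solution of this kind (up to sets of measure zero, of
course). For $0 < \lambda < 1$ and $\nu > 1$ this solution is given by

$H(y, z) = \Gamma(\nu/2)\,[\pi^{1/2}(1 - \lambda^2)^{1/2}\,\Gamma((\nu - 1)/2)]^{-1}\,z^{-(\nu/2 - 1)} \int_{-\infty}^{+\infty} f(x)\,[(z - (x - y)^2/(1 - \lambda^2))^+]^{(\nu-3)/2}\,dx,$

$-\infty < y < \infty$, $0 < z < \infty$, where

$(x)^+ = x$ for $x > 0$, $\quad (x)^+ = 0$ for $x \le 0$,

for every real number $x$. For $\nu = 1$ the solution is given by
$H(y, z) = \tfrac12[f(y - (z(1 - \lambda^2))^{1/2}) + f(y + (z(1 - \lambda^2))^{1/2})].$
For $\lambda = 1$ the solution is given by
$H(y, z) = f(y), \quad -\infty < y < \infty, \quad 0 \le z < \infty.$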

For the proof we observe first that the assertion concerning $\lambda = 1$ is trivial.
For $\lambda < 1$, notice that $f(x)$ is locally integrable by (i). Therefore, it is
obvious that

$J(y, z, \nu) = \int_{-\infty}^{+\infty} f(x)\,[(z - (x - y)^2/(1 - \lambda^2))^+]^{(\nu-3)/2}\,dx$

is defined for all $y$, all nonnegative $z$ and for $\nu \ge 3$. Moreover, it is easy to show
that $J(y, z, \nu)$ defines, for all $y$, each real $\nu > 1$ and almost all nonnegative $z$,
a locally integrable function of $z$, which after multiplication by $\exp\{-\eta z\}$ for
any $\eta > 0$ is absolutely integrable for $0 < z < \infty$. To show this, consider the
identity


yv-l1 ” 2/2¢2(1—d2 
r ( 9 ) (20°)? ‘. fyt+onee™r™ a 
(3) ; 
- [ yr! -wlte® [1% + the tet dt. 


The iterated integral on the right side of (3) is absolutely convergent by (i)
for each positive $\lambda < 1$, each $\sigma^2 > 0$, each $\nu > 1$ and all real $y$. But easy
transformation of the integration variables shows that this iterated integral is equal to


a — 2)" [ eo i o(4(X) + f(¥))(2 — u) 7? dz du 


where $X = y + [(z - u)(1 - \lambda^2)]^{1/2}$ and where $Y = y - [(z - u)(1 - \lambda^2)]^{1/2}$.
The absolute convergence of this iterated integral justifies changing the order
of integration and after an easy calculation we get for this integral


e eotltet e f(z) ¢ € - =r" dz dz. 


Another application of Fubini's theorem and the fact that $\exp\{-z/2\sigma^2\}$ is
positive and bounded for all nonnegative $z$ and every $\sigma^2 > 0$ show that $J(y, z, \nu)$
is a locally integrable function of $z$ for all $y$, each $\nu > 1$, and almost all
nonnegative $z$. This function is, after multiplication by $\exp\{-\eta z\}$, for any $\eta > 0$
absolutely integrable for all nonnegative $z$. Now we use the following simple
lemma.

LEMMA. If $f(x)$ satisfies (i) then for any $\lambda$ in $0 < \lambda < 1$ there exists exactly
one solution $g(y, \sigma^2)$ of the equation

(4) $\int_{-\infty}^{+\infty} f(x)\,e^{-(x-a)^2/2\sigma^2}\,dx = \lambda^{-1}\int_{-\infty}^{+\infty} g(y, \sigma^2)\,e^{-(y-a)^2/2\sigma^2\lambda^2}\,dy, \quad -\infty < a < \infty,$

for which the integral on the right side of (4) converges absolutely for each $\sigma^2 > 0$
and all real $a$. This solution is given by

(5) $g(y, \sigma^2) = [(2\pi)^{1/2}\sigma(1 - \lambda^2)^{1/2}]^{-1}\int_{-\infty}^{+\infty} f(x)\,e^{-(x-y)^2/2\sigma^2(1-\lambda^2)}\,dx, \quad -\infty < y < \infty.$


The uniqueness for a solution of (4) with the asserted property is well known.
For the proof that (5) satisfies (4) it is enough to notice that

$[(2\pi)^{1/2}\sigma(1 - \lambda^2)^{1/2}]^{-1}\int_{-\infty}^{+\infty}\int_{-\infty}^{+\infty} f(x)\,e^{-(x-y)^2/2\sigma^2(1-\lambda^2)}\,e^{-(y-a)^2/2\sigma^2\lambda^2}\,dy\,dx = \lambda\int_{-\infty}^{+\infty} f(x)\,e^{-(x-a)^2/2\sigma^2}\,dx$

and that the iterated integral on the left side of this identity converges
absolutely by (i).

Now we proceed with the proof of Theorem 1. Using (3) and the above lemma
it follows (again by an application of Fubini's theorem) that $J(y, z, \nu)$ is a
locally integrable function for almost all $y$, $-\infty < y < \infty$, and almost all
$z$, $0 < z < \infty$, and that $H(y, z)$ satisfies the integral equation (2). The lemma
just mentioned and the remarks about the absolute integrability of
$\exp\{-\eta z\}J(y, z, \nu)$ for all $\eta > 0$ show that $H(y, z)$ satisfies condition (ii).
The asserted representation of $H(y, z)$ for the case $\nu = 1$ is now obvious.
Moreover, it is well known that there is only one solution $H(y, z)$ of (2) satisfying (ii).
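
The case $\nu = 1$ lends itself to a quick numerical check. The sketch below is a Monte Carlo illustration under stated assumptions: it takes the representation $H(y, z) = \tfrac12[f(y - r) + f(y + r)]$ with $r = (z(1 - \lambda^2))^{1/2}$, draws $\eta$ from $N(a, \sigma^2\lambda^2)$ and $\zeta$ as $\sigma^2$ times a chi-square variable with one degree of freedom, and compares the average of $H(\eta, \zeta)$ with $w(a, \sigma^2)$ for the test function $f(x) = \exp\{-\alpha x^2\}$. The function names, the seed, and the sample size are illustrative choices, not part of the paper.

```python
import math
import random

def h_nu1(f, y, z, lam):
    """Candidate solution for nu = 1: average of f at y - r and y + r."""
    r = math.sqrt(z * (1.0 - lam * lam))
    return 0.5 * (f(y - r) + f(y + r))

def mc_mean_h(f, a, sigma, lam, n=200_000, seed=0):
    """Monte Carlo average of H(eta, zeta) for nu = 1."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n):
        eta = rng.gauss(a, sigma * lam)             # eta ~ N(a, sigma^2 lam^2)
        zeta = (sigma * rng.gauss(0.0, 1.0)) ** 2   # zeta / sigma^2 ~ chi-square(1)
        total += h_nu1(f, eta, zeta, lam)
    return total / n

def w_gauss(alpha, a, sigma):
    """w(a, sigma^2) = E exp(-alpha X^2) for X ~ N(a, sigma^2)."""
    d = 2.0 * alpha * sigma ** 2 + 1.0
    return math.exp(-alpha * a * a / d) / math.sqrt(d)
```

For example, `mc_mean_h(lambda x: math.exp(-0.5 * x * x), 0.3, 1.2, 0.6)` should agree with `w_gauss(0.5, 0.3, 1.2)` up to simulation error.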


3. A negative result. Now we will prove that in a certain sense the result
obtained by Neyman and Scott concerning the existence of unbiased estimates
$H(\eta, \zeta)$ for $w(a, \sigma^2)$ cannot be improved.

THEOREM 2. Let $f(x)$ be an entire function with real values on the real line and
satisfying Condition (i). There is for each real $\alpha > 0$ an entire function of order 2
and type $\alpha$ such that (2) does not have a solution $H(y, z)$ which satisfies (ii) for
any $\lambda > 1$. This means that there is no unbiased estimate for $w(a, \sigma^2)$ according
to the usual definition of unbiased estimates.

For the proof take $f(x) = \exp\{-\alpha x^2\}$. For a given $\alpha > 0$, the function $f(z)$
is an entire function of order 2 and type $\alpha$, which takes real values on the real
line. Suppose there is a solution $H(y, z)$ of (2) which satisfies Condition (ii).
Then for each $\sigma^2 > 0$, $K(y, \sigma^2)$ defined by


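$K(y, \sigma^2) = (\Gamma(\nu/2))^{-1}\int_0^{\infty} H(y, z)\,e^{-z/2\sigma^2}\,(z/2\sigma^2)^{\nu/2-1}\,dz/2\sigma^2$

exists for almost all $y$, is locally integrable, and

$\int_{-\infty}^{+\infty} K(y, \sigma^2)\,e^{-(y-a)^2/2\sigma^2\lambda^2}\,dy$

exists as an absolutely convergent integral for all real $a$, each $\sigma^2 > 0$ and any
$\lambda > 0$. Hence, for each $\lambda > 1$ and each $\sigma^2 > 0$ the equation

$\int_{-\infty}^{+\infty} e^{-\alpha x^2}\,e^{-(x-a)^2/2\sigma^2}\,dx = \lambda^{-1}\int_{-\infty}^{+\infty} K(y, \sigma^2)\,e^{-(y-a)^2/2\sigma^2\lambda^2}\,dy$

must be an identity in $a$ for all real $a$. We write $a = -s$ and get

(6) $\left(\frac{2\pi\sigma^2}{2\alpha\sigma^2 + 1}\right)^{1/2}\exp\left\{\frac{s^2[1 - 2\alpha\sigma^2(\lambda^2 - 1)]}{2\sigma^2\lambda^2(2\alpha\sigma^2 + 1)}\right\} = \lambda^{-1}\int_{-\infty}^{+\infty} K(y, \sigma^2)\,e^{-y^2/2\sigma^2\lambda^2}\,e^{-ys/\sigma^2\lambda^2}\,dy.$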

But it is obvious that these two expressions define for each $\sigma^2 > 0$ and each
$\lambda > 1$ analytic functions of $s$ in the whole complex plane, which are identical
on the real line and so, everywhere. Take now

(7) $\sigma^2(\lambda^2 - 1) > 1/2\alpha.$

This is possible since $\lambda > 1$. For $s = it$ and real $t$ the right side of (6) is a
Fourier transform of an absolutely integrable function and must converge to
zero for $|t| \to \infty$. But the left side of (6) goes to infinity for $|t| \to \infty$ by (7).
This is a contradiction to the existence of an unbiased estimate for $w(a, \sigma^2)$.







REMARK 1. We notice that the same argument proves the nonexistence of
more general solutions $H(y, z)$ of (2) in the case $\lambda > 1$. For instance, it is easy
to show that for $f(x) = \exp\{-\alpha x^2\}$, $\alpha > 0$, there is no solution $H(y, z)$ of
(2) for which the integral on the right side of (2) can be written as an iterated
conditionally convergent integral of the form


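$(2\lambda\sigma^2\Gamma(\nu/2))^{-1}\lim_{M,N\to\infty}\int_{-M}^{N} e^{-(y-a)^2/2\sigma^2\lambda^2}\lim_{\varepsilon\to 0,\;P\to\infty}\int_{\varepsilon}^{P} H(y, z)\,e^{-z/2\sigma^2}\,(z/2\sigma^2)^{\nu/2-1}\,dz\,dy.$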


For this we have only to consider the identity in $s$ obtained from (6) by
modifying the definition of $K(y, \sigma^2)$ and the right side of (6) in an obvious manner.
But using (allowed) partial integration it is easy to show that now the integral
on the right side of (6) exists also for all $s$ of the complex plane. Further, it is
well known that the Fourier transform of a conditionally integrable function is
a $o(|t|)$ for $|t| \to \infty$ and this leads again to a contradiction if $\sigma^2$ satisfies (7).

REMARK 2. Using Laplace transforms it is also possible to give a more general
theorem than Theorem 1 which covers also the case $\lambda > 1$. Because Theorem 2
gives a clear idea in which way such a theorem must be formulated, I do not
think that it is of any value to give it here.


4. Examples. According to Theorem 1 there is a solution $H(y, z)$ of (2) for
any $\lambda$ in the interval $0 < \lambda < 1$ if $f(x)$ is given by $\exp\{-\alpha x^2\}$, $\alpha > 0$. For
$0 < \lambda < 1$ this solution is given by

$C\,e^{-\alpha y^2}\,z^{1-\nu/2}\int_0^z e^{-\alpha u(1-\lambda^2)}\,(z - u)^{(\nu-3)/2}\,u^{-1/2}\cosh(2\alpha y(u(1 - \lambda^2))^{1/2})\,du,$

where $C = \Gamma(\nu/2)[\Gamma((\nu - 1)/2)\sqrt{\pi}]^{-1}$. This expression and Theorem 2 give
a complete answer to a question raised in the paper of Neyman and Scott (p. 12
of [1]).
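
The expression for $f(x) = \exp\{-\alpha x^2\}$ can be evaluated numerically. The following Python sketch assumes the expression reads $C e^{-\alpha y^2} z^{1-\nu/2}\int_0^z e^{-\alpha u(1-\lambda^2)}(z-u)^{(\nu-3)/2}u^{-1/2}\cosh(2\alpha y(u(1-\lambda^2))^{1/2})\,du$ with $C = \Gamma(\nu/2)[\Gamma((\nu-1)/2)\sqrt{\pi}]^{-1}$, and restricts itself to $\nu \ge 3$. For $\nu = 3$ the general formula of Theorem 1 reduces to the average of $f$ over $[y - r, y + r]$ with $r = (z(1-\lambda^2))^{1/2}$, which gives an independent closed form through the error function. All names are illustrative.

```python
import math

def h_gauss(y, z, alpha, lam, nu, steps=4000):
    """H(y, z) for f(x) = exp(-alpha x^2), nu >= 3, by the midpoint rule.

    The substitution u = t^2 (so u^(-1/2) du = 2 dt) removes the
    u^(-1/2) singularity at the lower limit.
    """
    c = math.gamma(nu / 2.0) / (math.gamma((nu - 1) / 2.0) * math.sqrt(math.pi))
    k = 1.0 - lam * lam
    step = math.sqrt(z) / steps
    total = 0.0
    for i in range(steps):
        t = (i + 0.5) * step
        total += (math.exp(-alpha * k * t * t)
                  * (z - t * t) ** ((nu - 3) / 2.0)
                  * math.cosh(2.0 * alpha * y * t * math.sqrt(k)))
    integral = 2.0 * total * step
    return c * math.exp(-alpha * y * y) * z ** (1.0 - nu / 2.0) * integral

def h_gauss_nu3(y, z, alpha, lam):
    """Closed form for nu = 3: the average of f over [y - r, y + r]."""
    r = math.sqrt(z * (1.0 - lam * lam))
    ra = math.sqrt(alpha)
    area = 0.5 * math.sqrt(math.pi / alpha) * (math.erf(ra * (y + r)) - math.erf(ra * (y - r)))
    return area / (2.0 * r)
```

Agreement of the two functions at $\nu = 3$ is a consistency check on the reading of the formula, not a proof of it.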

Another function $f(x)$ whose inverse is occasionally used as a normalizing
transform is given by

$f(x) = (b + x)^p$, $p > 0$, for $x > -b$; $\quad f(x) = 0$ for $x \le -b$.

If $p$ is an integer it is more usual to define $f(x)$ by $(b + x)^p$ for all real $x$.
Moreover in most practical applications $|b|$ is a very large number such that these
two definitions are "almost" equivalent. It appears that there is an unbiased
estimate $H(\eta, \zeta)$ in the case $0 < \lambda < 1$, but it does not seem to be appropriate
for practical use in general, although it is of course possible for rational numbers
$p$ and integers $\nu = 2k + 3$, where $k \ge 0$, to express $J(y, z, \nu)$ by elementary
functions. For instance, for $y > -b$ and $z < (b + y)^2/(1 - \lambda^2)$ we get the
following form of $H(y, z)$ where $p = n/m$, with $(n, m) = 1$.
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l 


2(e—1) ryttk-lD—r 
P _— ,2\~(e—-) 7(m+n)/m ad r 2(k 1) (a + y) xX 
" - E 2, (—1) ( r Veet 


— ypimtin _1)* ee _ 7 (a + ayo | 
2; ( o r + ml +2k—-D—7 


where $X = b + y + [z(1 - \lambda^2)]^{1/2}$ and where $Y = b + y - [z(1 - \lambda^2)]^{1/2}$. If $p$
is an integer and $f(x)$ is defined by $(b + x)^p$ for all $x$ then $H(y, z)$ is given
by this expression for all $y$ and $z \ge 0$ and any $\lambda > 0$ in agreement with the
results of Neyman and Scott.


5. Relation to previously obtained results. The proof of Theorem 1 shows that
$h_x(y, z) = C(\lambda, \nu)\,z^{-(\nu/2-1)}\,[(z - (x - y)^2/(1 - \lambda^2))^+]^{(\nu-3)/2},$
$-\infty < y < \infty$, $0 < z < \infty$,


k - 
H(y, 2) = mY(»/2)[Ve((1 — ¥))r( — 1)/2) "2" 2? (‘) (—1)*" 


where

$C(\lambda, \nu) = \Gamma(\nu/2)\,[\sqrt{\pi(1 - \lambda^2)}\,\Gamma((\nu - 1)/2)]^{-1}$

is a solution of the equation

$e^{-(x-a)^2/2\sigma^2} = (2\lambda\sigma^2\Gamma(\nu/2))^{-1}\int_0^{\infty}\int_{-\infty}^{+\infty} h_x(y, z)\,e^{-(y-a)^2/2\sigma^2\lambda^2}\,e^{-z/2\sigma^2}\,(z/2\sigma^2)^{\nu/2-1}\,dy\,dz$

for every $\nu > 1$, every fixed real $x$ and $0 < \lambda < 1$. This means $h_x(\eta, \zeta)$ is an
unbiased estimate for $((2\pi)^{1/2}\sigma)^{-1}e^{-(x-a)^2/2\sigma^2}$. This fact is proved and used by Kolmogorov
[2], who obtains the results of Theorem 1 for the following special cases: $f(x)$
is the characteristic function of a measurable set and $\lambda = (1/(\nu + 1))^{1/2}$, $\nu = 2, 3, 4, \ldots$.
But at least for integers $\nu \ge 2$ it is easy, using Kolmogorov's method,
to extend his result to the more general assertion of Theorem 1. The method
of Kolmogorov and my own method for proving Theorem 1 are of course closely
related. Another paper which can be mentioned in connection with the method
used here is one by Washio, Morimoto and Ikeda [3].
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ESTIMATES WITH PRESCRIBED VARIANCE BASED ON 
TWO-STAGE SAMPLING 


By ALLAN BIRNBAUM¹ AND WILLIAM C. HEALY, JR.²
New York University and Ethyl Corporation 


1. Summary. A method is given which provides, under conditions satisfied 
by many common distributions, rules for sampling in two stages so as to obtain 
an unbiased estimator of a given parameter, having variance equal to, or not 
exceeding, a prescribed bound. The method is applied to estimation of the 
means of binomial, Poisson, and hypergeometric distributions; scale-parameters 
in general and of the Gamma distribution in particular; the variance of a normal 
distribution; and a component of variance. The use of such estimators to achieve 
homoscedasticity is discussed. Optimum sampling rules are discussed for some 
of these estimators, and some tables are given to facilitate their use. The effi- 
ciency of the method is shown to be high in many cases. 


2. Introduction. In most problems of estimation, estimators based on samples 
of fixed sizes have precisions which depend on unknown parameters, and estima- 
tors with prescribed precision are not available without resort to sequential 
sampling in two or more stages, as in Stein’s procedure [1] for estimation of the
mean of a normal distribution with unknown variance. For problems other than 
those of the type treated by Stein the only available general methods which are 
both fairly practicable and efficient seem to be the double-sampling method of 
Cox [2], [3] and the sequential method of Anscombe [4]. The latter methods, 
however, are approximate, being based on asymptotic theory, and there seems 
to be no easily applicable method available for determining in a given case the
closeness of the approximations involved. An approach employing a different 
concept of prescribed precision is described by Graybill [16]. 

The method to be described below (developed independently by the authors) 
is a simple one which provides, in a number of problems, procedures for two- 
stage sampling leading to estimators which are exactly unbiased; in certain 
problems these estimators have exactly a prescribed variance, while in other 
problems they have variances never exceeding but generally close to a prescribed 
bound. Under certain conditions, primarily that the precision prescribed is 
sufficiently high, these estimators are shown to have generally high efficiency. 


3. General discussion of the method. 


3.1 Statement of problems. Let $S = \{x\}$ be the sample space for a single random
observation, $X$, on which a density or discrete elementary probability function,
$f(x, \theta)$, is defined for each $\theta$ in a given parameter space $\Omega$. Suppose that it is
desired to estimate with prescribed precision a real-valued function $p = p(\theta)$.
We adopt the following formalization of this requirement: it is required to
find an unbiased estimator of $p$ having variance not exceeding a given positive
function $B(\theta)$.


3.2 Assumptions. 


I. Assume that, for each non-sequential sample size $n$ not less than a known
$n_0$, there exists an unbiased estimator $t = t(x_1, \ldots, x_n)$ of $p$, i.e.,

(1) $E_\theta t(X_1, \ldots, X_n) = p(\theta).$

II. Let $\sigma^2(\theta, n)$ denote the variance of $t$ for sample size $n$. Assume that, for
each non-sequential sample size $m$ not less than a known $m_0$, there exists a
measurable "second sample size function" $n = n(x_1, \ldots, x_m)$ taking integer
values not less than $n_0$, and such that either

(2a) $E_\theta \sigma^2(\theta, n(X_1, \ldots, X_m)) = B(\theta)$

or

(2b) $E_\theta \sigma^2(\theta, n(X_1, \ldots, X_m)) \le B(\theta)$

holds for each $\theta$.

3.3 Estimation Procedure. Under the above assumptions, a simple unbiased
estimator of $p$, having variance not exceeding $B(\theta)$, is given by any procedure
of the following form:

A) Take a sample of $m$ observations ($m \ge m_0$), $x_1, \ldots, x_m$, and compute
$n = n(x_1, \ldots, x_m)$.

B) Take a second independent sample of $n = n(x_1, \ldots, x_m) \ge n_0$ additional
observations, $x_{m+1}, \ldots, x_{m+n}$.

C) Estimate $p$ by $t = t(x_{m+1}, \ldots, x_{m+n})$, ignoring at this stage the first
sample observations $x_1, \ldots, x_m$.
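
Steps A) to C) can be sketched as a small generic routine. The Python fragment below is an illustration only: the callables `draw`, `second_size` and `estimator` are hypothetical stand-ins for the sampling mechanism, the second sample size function $n(x_1, \ldots, x_m)$ and the estimator $t$, and must be supplied by the user for the problem at hand.

```python
def two_stage_estimate(draw, m, second_size, estimator):
    """Steps A) to C); draw(k) must return a list of k new independent observations."""
    first = draw(m)                 # A) first sample of size m
    n = second_size(first)          #    second sample size n(x_1, ..., x_m)
    second = draw(n)                # B) second, independent sample of size n
    return estimator(second), n     # C) estimate from the second sample only
```

The first sample enters only through $n$, which is what makes the conditional argument of the next subsection work.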

The fact that this procedure seems to involve gross waste of information in 
the first sample suggests at first sight that its efficiency must be low. It will be 
shown, however, that the efficiency of the method, with a suitable choice of 
sampling rule, is so high in a number of cases that the search for more efficient 
methods (generally not known at present) would seem to be of more theoretical 
than practical interest for those cases. 

3.4 Properties of the Method. We first verify that, when functions $t$ and $n$ can
be found satisfying conditions (1) and (2b) above, the method gives unbiased
estimators with variances not exceeding the prescribed bound. Let

$N = n(X_1, \ldots, X_m).$

Then the estimate is $T = t(X_{m+1}, \ldots, X_{m+N})$, and

(3) $E_\theta(T) = E_\theta(E_\theta(T \mid N)) = E_\theta(p) = p,$

since, for each fixed $n \ge n_0$, we have $E_\theta t(X_{m+1}, \ldots, X_{m+n}) = p$ by (1). Also

(4) $\mathrm{Var}_\theta(T) = E_\theta \mathrm{Var}_\theta(T \mid N) = E_\theta \sigma^2(\theta, N) \le B(\theta)$

by (2b); if (2a) holds, $\mathrm{Var}_\theta(T) = B(\theta)$. (The first equality in (4) holds because
$E_\theta(T \mid N) = p$ does not depend on $N$, so that the variance of the conditional
mean vanishes.)

3.5 Efficiency Considerations. A measure of efficiency for any sequential
estimator satisfying (3) and (4), and not restricted to the use of only two stages
of sampling, may be devised as follows: It has been shown by Wolfowitz [5],
under certain regularity conditions on $f(x, \theta)$ and $p(\theta)$ and certain broad
conditions on the sequential sampling rule, that each unbiased sequential estimator
$T$ of $p$, together with its total random sequential sample size $N'$, satisfies


(5) Vars(T) 2 By (2180 / pyc 


From (4) and (5) we obtain, under the conditions mentioned, the following 
lower bound for the expected total sample size required by any sequential esti- 
mator meeting our conditions (3) and (4): 


(6) BAN) & By (2208L%E0) 7 Bo) 


As will be shown by specific examples below, there does not necessarily exist an 
estimator which attains this lower bound. Nevertheless it is useful to define as 
an index of efficiency the function 


(7) R(0) = B, (2s 26) / (B()Es(N’)), 


where $E_\theta(N')$ is computed for any given estimate satisfying (3) and (4). As an
example of the interpretation of this index, suppose that for a given estimator
we find that $R(\theta) \ge 0.90$ for all $\theta$; then we can assert that for every estimator
meeting the conditions (3), (4), and the general conditions of [5], the required
expected total sample size function $E_\theta(N^*)$ will satisfy $E_\theta(N^*) \ge 0.90\,E_\theta(N')$
for all $\theta$; hence average savings in sample sizes of at most 10% might be achieved.
(It is known that in general the savings actually possible are less than indicated
by such bounds, e.g. less than the 10% indicated here.) Such efficiency bounds
are given in the following sections for various specific problems.

The estimation methods of the present paper are roughly similar to the method
of Stein [1] for estimation of a normal mean. For most purposes the
prescribed-length confidence interval formulation adopted by Stein seems preferable to the
prescribed-variance formulation adopted here; the present formulation is akin
to a decision-theoretic one with mean-squared error loss function, but the
restriction of unbiasedness which provides essential simplifications of calculations also
generally entails some inefficiency from this standpoint. While Stein was able
to give exact confidence intervals by determination of the exact (Student's)
distribution of the point estimator implicit in his method, the exact
distributions of the estimators given here are not known. Consequently this paper makes
no contribution to the theory of exact interval estimation comparable with
Stein's, apart from the following crude use of Tchebycheff's inequality: If $\hat\theta$ is
an unbiased estimator of $\theta$ with variance not exceeding a constant $B$, then the
interval estimator $\hat\theta \pm \varepsilon$ covers $\theta$ with probability at least $1 - \alpha$ where $\alpha = B/\varepsilon^2$.
For many of the problems considered below, even such confidence intervals
have not previously been available. (For a number of problems, a method of
constructing confidence intervals of fixed length and confidence coefficient, but
probably poor efficiency, was given in [6].)

In many cases, particularly those in which high precision is specified, the 
estimators given here have approximately normal distributions. This is illus- 
trated in the Poisson case below. To the extent that this is true, all methods for 
confidence regions and significance tests based on assumptions of normality 
with known variance may be applied. Useful approximations to the distributions 
of some of the estimators can probably be based on Student’s $t$ distributions
with the number of degrees of freedom determined by a fitting of fourth moments; 
further investigation of this possibility is required. 

It should be noted that, with the methods of this paper, there will sometimes 
occur samples which on inspection strongly suggest that some modification of 
the estimators given here would be more appropriate and efficient. A similar 
comment applies, with somewhat less force, to Stein’s procedure and some other 
sequential procedures. These features seem symptomatic of possible improve- 
ments in efficiency of these methods which have not yet been found. They seem 
also to point to more basic problems in the foundations of statistical inference 
which lie outside the scope of the present paper. The estimators given in this 
paper have variance and efficiency properties which are valid within the uncon- 
ditional two-stage sampling probability framework; these properties are not 
considered here (except in some computational steps) conditionally on a given 
first or second sample size. The unbiasedness properties of these estimators 
generally hold both conditionally and unconditionally. 


4. Estimation of a mean. Suppose that $X$ is real-valued and that the mean

$\theta = p(\theta) = E_\theta(X)$

is the parameter to be estimated. Then Assumption I is obviously satisfied if
we take

$t = t(x_{m+1}, \ldots, x_{m+n}) = \frac{1}{n}\sum_{i=1}^{n} x_{m+i}.$

Letting $\sigma_\theta^2 = \mathrm{Var}_\theta(X)$, we have $\sigma^2(\theta, n) = \sigma_\theta^2/n$. Condition (4) becomes

(8) $E_\theta(1/N) = E_\theta[1/n(X_1, \ldots, X_m)] \le B(\theta)/\sigma_\theta^2.$

Then any integer $m \ge m_0$, and any function $n = n(x_1, \ldots, x_m)$ satisfying (8),
may be used to define an estimator, which will then automatically satisfy (3)
and (4). Such an estimator has expected total sample size

(9) $E_\theta(N') = m + E_\theta[n(X_1, \ldots, X_m)] = m + E_\theta(N),$

and efficiency bound


(10) R(6) = Es (2 eX. 01) / B®) [m + E,(N)]. 


In the special case of constant prescribed precision, $B(\theta) = B$, (8) becomes
$(1/B)E_\theta[1/n(X_1, \ldots, X_m)] \le 1/\sigma_\theta^2$, and the problem of finding a suitable
second-sample-size function $n(x_1, \ldots, x_m)$ may be stated as the problem of
finding an estimator $1/\hat\sigma^2$ of $1/\sigma_\theta^2$, based on $m$ observations, which is unbiased
(condition (2a)), or which has positive bias at no $\theta \in \Omega$ (condition (2b)). Then
the sequential sampling rule may be stated as:

A') Observe $x_1, \ldots, x_m$, and compute $\hat\sigma^2 = \hat\sigma^2(x_1, \ldots, x_m)$.

B') Take a second sample of $n = \hat\sigma^2/B$ observations $x_{m+1}, \ldots, x_{m+n}$.

C') Estimate the mean $\theta = E_\theta(X)$ by the mean $t = (1/n)\sum_{i=1}^{n} x_{m+i}$ of the
second sample only.


It is sometimes convenient to define $\hat\sigma^2(x_1, \ldots, x_m)$ formally in such a way
that $\hat\sigma^2/B$ is not always an integer. Then for most applications it will suffice to
take $n$ as the smallest integer not less than $\hat\sigma^2/B$. A calculation like that above
shows that this gives again $\mathrm{Var}_\theta(T) < B$. Alternatively, given $\hat\sigma^2/B$, we could
use a random device to choose $n = [\hat\sigma^2/B]$ = the largest integer not exceeding
$\hat\sigma^2/B$, with a probability $\gamma$, and $n = [\hat\sigma^2/B] + 1$ with probability $1 - \gamma$, where
$\gamma$ is determined by the equation

$\gamma[\hat\sigma^2/B]^{-1} + (1 - \gamma)([\hat\sigma^2/B] + 1)^{-1} = B/\hat\sigma^2.$

The latter procedure, which is perhaps of primarily theoretical interest, gives
$\mathrm{Var}_\theta(T) = B$ exactly if $E_\theta[1/\hat\sigma^2(X_1, \ldots, X_m)] = 1/\sigma_\theta^2$ exactly. Henceforth we
write $n = \hat\sigma^2/B$ to indicate that one of these procedures is used in defining $n$. It
follows that calculations based on the equation $n = \hat\sigma^2/B$, such as the equation

$E_\theta n(X_1, \ldots, X_m) = E_\theta \hat\sigma^2(X_1, \ldots, X_m)/B$

used below, may involve an error whose magnitude is in any case less than one.
Similar remarks apply to cases of the method other than those of estimation of
a mean.
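
The randomized choice of $n$ can be made explicit. Writing $x = \hat\sigma^2/B$ and $k = [x]$, the defining equation $\gamma/k + (1 - \gamma)/(k + 1) = 1/x$ solves to $\gamma = k(k + 1 - x)/x$. The Python sketch below uses this solution; it assumes $x \ge 1$ so that $k \ge 1$, and the function names are illustrative rather than taken from the paper.

```python
import math
import random

def gamma_for(x):
    """Probability of choosing n = floor(x), where x = sigma_hat^2 / B >= 1."""
    k = math.floor(x)
    return k * (k + 1 - x) / x

def random_n(x, rng=random):
    """Draw n = floor(x) with probability gamma, otherwise floor(x) + 1."""
    k = math.floor(x)
    return k if rng.random() < gamma_for(x) else k + 1

def expected_inverse_n(x):
    """E(1/n) under the randomized rule; equals 1/x by construction."""
    k = math.floor(x)
    g = gamma_for(x)
    return g / k + (1 - g) / (k + 1)
```

Because $E(1/n \mid \hat\sigma^2) = B/\hat\sigma^2$ holds exactly under this rule, the rounding introduces no slack in condition (8).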


For any such procedure we have expected total sample size

$E_\theta(N') = m + E_\theta n(X_1, \ldots, X_m) = m + (1/B)E_\theta \hat\sigma^2(X_1, \ldots, X_m),$

and efficiency bound given by

$1/R(\theta) = (B/\sigma_\theta^2)E_\theta(N') = (Bm/\sigma_\theta^2) + (1/\sigma_\theta^2)E_\theta \hat\sigma^2.$

If $B$ is sufficiently small, and/or if $\theta$ is such that $\sigma_\theta^2$ is sufficiently large, it is true
in many cases (as illustrated below) that $(1/\sigma_\theta^2)E_\theta \hat\sigma^2 \approx 1$ and that $Bm/\sigma_\theta^2 \approx 0$,
and hence that $R(\theta) \approx 1$; in such cases, for the indicated range of $\theta$, no
appreciable improvements in efficiency are possible even by resort to fully sequential
estimators.

Such estimators have been found and investigated quantitatively for a
number of common problems. These results are summarized in the following
paragraphs.

4.1 Poisson Mean. If $X$ has the Poisson distribution $f(x, \theta) = e^{-\theta}\theta^x/x!$ for
$x = 0, 1, \ldots$, we may take

$\hat\sigma^2 = \hat\sigma^2(x_1, \ldots, x_m) = \left(\sum_{i=1}^{m} x_i + 1\right)/m,$

since $y = \sum_{i=1}^{m} x_i$ has the Poisson distribution $f(y, m\theta) = e^{-m\theta}(m\theta)^y/y!$,
$y = 0, 1, 2, \ldots$, and

$E_\theta(1/\hat\sigma^2) = m\sum_{y=0}^{\infty} f(y, m\theta)/(y + 1) = m e^{-m\theta}\sum_{y=0}^{\infty}(m\theta)^y/(y + 1)!$

$= (e^{-m\theta}/\theta)\sum_{y=0}^{\infty}(m\theta)^{y+1}/(y + 1)! = (1 - e^{-m\theta})/\theta < 1/\theta = 1/\sigma_\theta^2.$

When the second sample size is determined by $n = \hat\sigma^2/B = (y + 1)/mB$, the
expected total sample size is

$E_\theta(N') = m + E_\theta(n) = m + (1/mB)E_\theta(y + 1) = m + (m\theta + 1)/mB.$

This is minimized by taking $m \approx 1/B^{1/2}$, regardless of the value of $\theta$. (In other
examples, an optimal first sample size is not so simple to determine.) Then
$E_\theta(N') = \theta/B + 2/B^{1/2}$.

This estimator has efficiency bound $R(\theta)$ given by $1/R(\theta) = 1 + 2B^{1/2}/\theta$.
If, for example, $\theta = 8B^{1/2}$, then $R(\theta) = 0.8$, and a decrease of at most

$100(1 - R(\theta))\% = 20\%$

in $E_\theta(N')$ might be possible by resort to some (unknown) more refined sequential
procedure; for $\theta \gg 8B^{1/2}$, the possible gains are negligible.
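
The Poisson calculations above are easy to reproduce. The following Python sketch sums the series for $E_\theta(1/\hat\sigma^2)$ directly, evaluates $E_\theta(N') = m + (m\theta + 1)/mB$, and computes $R(\theta)$ for the choice $m = 1/B^{1/2}$. The function names and the truncation point `y_max` are illustrative choices.

```python
import math

def expected_inverse_var_estimate(theta, m, y_max=2000):
    """m * sum_y f(y, m*theta) / (y + 1), summed term by term."""
    lam = m * theta
    p = math.exp(-lam)          # f(0, m*theta)
    total = 0.0
    for y in range(y_max):
        total += p / (y + 1)
        p *= lam / (y + 1)      # f(y + 1, m*theta) from f(y, m*theta)
    return m * total

def expected_total_size(theta, m, B):
    """E(N') = m + (m*theta + 1) / (m*B)."""
    return m + (m * theta + 1) / (m * B)

def efficiency_bound(theta, B):
    """R(theta) = 1 / (1 + 2*sqrt(B)/theta), valid for m = 1/sqrt(B)."""
    return 1.0 / (1.0 + 2.0 * math.sqrt(B) / theta)
```

Minimizing `expected_total_size` over integer `m` for a few values of `theta` shows that the minimizing `m` does not move with `theta`, as stated in the text.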

An alternative two-stage estimator (of the mean of a Poisson process) em- 
ploying “inverse” sampling in the first stage, given in [10], has exactly variance 
B, but can be shown to be less efficient. 

The following discussion illustrates that such estimators can have
approximately normal distributions. For any fixed $\theta > 0$, $B > 0$, and $k$, we may write
$\mathrm{Prob}\{(T - \theta)/B^{1/2} < k\} = E_\theta U(N, \theta, k, B)$, where

$U(N, \theta, k, B) = \mathrm{Prob}\{(T - \theta)/B^{1/2} < k \mid N\}.$

For sufficiently large fixed $N$,

$U(N, \theta, k, B) = \Pr\{(T - \theta)/(\theta/N)^{1/2} < kB^{1/2}/(\theta/N)^{1/2}\} \approx \Phi(k(BN/\theta)^{1/2}),$

where $\Phi(u)$ is the standard normal c.d.f. As $B \to 0$, the random variable
$\Phi(k(BN/\theta)^{1/2})$ converges in probability to the constant $\Phi(k)$, as does the
random variable $U(N, \theta, k, B)$ with $N$ random. Since $0 \le U \le 1$, we have

$E_\theta(U) = \mathrm{Prob}\{(T - \theta)/B^{1/2} < k\} \to \Phi(k)$

as $B$ decreases, proving the asymptotic normality of $T$.
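
The behaviour of the Poisson two-stage estimator can also be examined by simulation. The Python sketch below is an illustration under stated assumptions: it takes $m$ as the integer nearest $1/B^{1/2}$, rounds $n = (y + 1)/mB$ up to an integer, and uses the fact that the sum of $n$ independent Poisson($\theta$) observations is Poisson($n\theta$). The sampler is Knuth's multiplication method; the names, the seed and the number of replications are arbitrary choices.

```python
import math
import random

def poisson(rng, lam):
    """Knuth's multiplication method, splitting large means to avoid underflow."""
    if lam > 500.0:
        return poisson(rng, lam / 2.0) + poisson(rng, lam / 2.0)
    limit = math.exp(-lam)
    k, p = 0, rng.random()
    while p > limit:
        k += 1
        p *= rng.random()
    return k

def simulate(theta, B, reps=3000, seed=0):
    """Sample mean and variance of T over repeated two-stage experiments."""
    rng = random.Random(seed)
    m = max(1, round(1.0 / math.sqrt(B)))
    values = []
    for _ in range(reps):
        y = poisson(rng, m * theta)                 # first-sample total
        n = math.ceil((y + 1) / (m * B) - 1e-9)     # second sample size, rounded up
        total = poisson(rng, n * theta)             # second-sample total
        values.append(total / n)
    mean = sum(values) / reps
    var = sum((v - mean) ** 2 for v in values) / (reps - 1)
    return mean, var
```

With `theta = 2.0` and `B = 0.01`, the simulated mean should lie close to 2 and the simulated variance close to, and on average not above, 0.01.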
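4.2 Binomial Mean. If $X$ has the binomial distribution

$f(x, \theta) = \theta^x(1 - \theta)^{1-x}$

we may take

$\hat\sigma^2 = (1 - 2^{-(m+1)})\left(\sum x_i + 1\right)\left(m + 1 - \sum x_i\right)/(m + 1)(m + 2),$

since (by a calculation similar to that in the preceding section)

$E_\theta(1/\hat\sigma^2) = (1 - \theta^{m+2} - (1 - \theta)^{m+2})/[(1 - 2^{-(m+1)})\,\theta(1 - \theta)]$

$\le 1/\theta(1 - \theta) = 1/\sigma_\theta^2.$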
The expected total sample size is

$E_\theta(N') = m + (1/B)E_\theta \hat\sigma^2 = m + \frac{1 - 2^{-(m+1)}}{B}\cdot\frac{m(m - 1)\theta(1 - \theta) + m + 1}{(m + 1)(m + 2)}$

$= m + \frac{1 - 2^{-(m+1)}}{B}\left[\theta(1 - \theta)\left(1 - \frac{4m + 2}{(m + 1)(m + 2)}\right) + \frac{1}{m + 2}\right].$

The latter expression does not yield a minimizing value of $m$ independent of the
unknown $\theta$, but for any chosen $B$ and guessed value $\theta'$ a minimizing value

$m = m(\theta', B)$

can be found by numerical solution of the equation $\frac{\partial}{\partial m}E_{\theta'}(N') = 0$. Table 1
provides some such values.
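
The binomial formulas can be checked by direct summation. The Python sketch below assumes the variance estimate $\hat\sigma^2 = (1 - 2^{-(m+1)})(y + 1)(m + 1 - y)/(m + 1)(m + 2)$ with $y = \sum x_i$, compares the summed value of $E_\theta(1/\hat\sigma^2)$ with the closed form $(1 - \theta^{m+2} - (1 - \theta)^{m+2})/[(1 - 2^{-(m+1)})\theta(1 - \theta)]$ that direct summation yields, and searches over integer $m$ for the minimizer of $E_{\theta'}(N')$ instead of solving the derivative equation. The function names and the search limit `m_max` are illustrative.

```python
import math

def sigma2_hat(y, m):
    """Variance estimate from a first sample of size m with y successes."""
    return (1 - 2.0 ** -(m + 1)) * (y + 1) * (m + 1 - y) / ((m + 1) * (m + 2))

def expected_inverse(theta, m):
    """E(1 / sigma2_hat), summed over y ~ Binomial(m, theta)."""
    return sum(math.comb(m, y) * theta ** y * (1 - theta) ** (m - y) / sigma2_hat(y, m)
               for y in range(m + 1))

def closed_form_inverse(theta, m):
    """Closed form of E(1 / sigma2_hat) for 0 < theta < 1."""
    num = 1 - theta ** (m + 2) - (1 - theta) ** (m + 2)
    return num / ((1 - 2.0 ** -(m + 1)) * theta * (1 - theta))

def expected_total_size(theta, m, B):
    """E(N') = m + E(sigma2_hat) / B."""
    q = theta * (1 - theta)
    shrink = 1 - 2.0 ** -(m + 1)
    return m + (shrink / B) * (m * (m - 1) * q + m + 1) / ((m + 1) * (m + 2))

def best_first_sample_size(theta_guess, B, m_max=1000):
    """Integer m in 0..m_max minimizing E(N') at the guessed theta."""
    return min(range(m_max + 1), key=lambda m: expected_total_size(theta_guess, m, B))
```

A grid search over integers is used here only because it is simple; it is a substitute for, not a description of, the numerical solution mentioned in the text.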


TABLE 1
Best Binomial First Sample Sizes $m(\theta', B)$

0.5
0.4 or 0.6
0.3 or 0.7
0.2 or 0.8    11
0.0 or 1.0    18

The value $m = 0$ indicates use of a single sample procedure with $n = 1/4B$
observations. However, calculations of $E_\theta(N')$ for various values of $B$ and $m$,
such as those given in Table 2, indicate that for $B \le (0.05)^2$ a choice of $m$ such
as $m(0.2, B)$ provides appreciable savings as compared with $m = 0$ over a wide
range of $\theta$ at the cost of a relatively small loss as compared with use of a best
value $m(\theta', B)$ based on any guessed value $\theta'$ which happens to be correct.


TABLE 2
Values of $E_\theta(N')$ for Binomial Estimates
(a) $B = (0.05)^2$


100 112.3 
100 109.4 
100 101.0 
100 86.9 
100 67.1 
100 

(b) $B = (0.005)^2$


47 


0.5 10,000 10 ,056 10,120 


0.4 or 0.6 10,000 9,688 9,734 
0.3 or 0.7 10,000 8,585 | 8,573 
0.2 or 0.8 10,000 | 6,746 
0.1 0r0.9 | 10,000 4,173 
0.0 or 1.0 10,000 | 863 





The efficiency bound $R(\theta)$ of such estimators is given by

$1/R(\theta) = Bm/\theta(1 - \theta) + (1 - 2^{-(m+1)})\{1/[(m + 2)\theta(1 - \theta)] + 1 - (4m + 2)/[(m + 1)(m + 2)]\}.$

For any given $B$, the values of $m$ to be considered are $0 \le m \le m(0, B)$. If we
take $m = m(0, B) = 1/B^{1/2} - 2 \approx 1/B^{1/2}$ (for $B < (0.05)^2$), we have

$1/R(\theta) \approx 1 + B^{1/2}(2/\theta(1 - \theta) - 4).$

For any $B$, as $\theta \to 0$ or 1, $R(\theta) \to 0$; but these are values of $\theta$ for which the
lower bound $\sigma_\theta^2/B$ on $E_\theta(N')$ cannot be attained by any estimator with the
desired properties. For any fixed $\theta$, $0 < \theta < 1$, as $B \to 0$, $R(\theta) \to 1$; thus the
efficiency of $\hat\theta$ cannot be much improved upon when high precision is required.
Analogous statements hold if we take, for example, $m(0.2, B)$.

The formula for the second sample size is

$n = c\left(\sum x_i + 1\right)\left(m + 1 - \sum x_i\right),$

where $c = c(B, m) = (1 - 2^{-(m+1)})/B(m + 1)(m + 2)$. Table 3 provides
some values of $c(B, m)$.
TABLE 3 
Values of c(B, m) for Binomial Second Sample Sizes 


m 





3.03 18.94 75.76 
1.47 9.19 36.76 
0.87 5.41 21.64 
0.57 3.56 14.24 
0.40 2.52 10.08 
0.30 | 


B 
$(0.05)^2$    $(0.02)^2$    $(0.01)^2$    $(0.005)^2$


The variance of $\hat\theta$ is

$\mathrm{Var}_\theta(\hat\theta) = B(1 - \theta^{m+2} - (1 - \theta)^{m+2})/(1 - 2^{-(m+1)});$

this is appreciably less than $B$ only when $\theta$ is very near 0 or 1, provided $m$ is
not very small.

There exists, as in the Poisson case, a procedure employing “inverse” sampling 
to yield a binomial estimator having exactly constant variance as follows: 

Let M be a fixed positive integer. Make successive independent Bernoulli 
trials (samples of size one) until min (total successes, total failures) = M. Let 
x be the number of trials up to and including the Mth success. Let y be the 
number of trials up to and including the Mth failure. Take an additional sample 
of size $n = M/B(x + y)$ and let $z$ denote the number of additional successes
observed. Then $\hat\theta = Bz(x + y)/M$ is an unbiased estimator of $\theta$ having variance
exactly equal to B. It seems clear that the expected sample size will be larger 
for this “inverse” sampling plan, and that as a practical matter exactly pre- 
scribed variance would seldom be worth the cost in additional observations. 

4.3 Hypergeometric mean. In a finite population of known size $M$, let $\theta = D/M$
be the unknown proportion of items having a given trait, e.g. being defective.
Let $X$ denote the number of defectives in a first sample of size $m$, $n = n(x)$
the size of a second sample, and $Y$ the number of defectives in the second sample.
(All sampling is without replacement.) Then it is readily verified that an
unbiased estimator of $\theta$ is
$$\hat\theta = [(Y(M - m)/n) + X]/M.$$
This estimator will have variance bounded by $B$ if we take
$$n = \frac{(M - m)^2 (x + 1)(m - x + 1)}{BM^2(m + 1)(m + 2) + (x + 1)(m - x + 1)(M - m)}$$

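since
$$\operatorname{var}(\hat\theta \mid x) = \frac{(D - x)(M - m - D + x)(M - m - n)}{(M - m - 1)M^2 n},$$
and hence
$$\begin{aligned}
\operatorname{var}\hat\theta &= E \operatorname{var}(\hat\theta \mid x) \\
&= \sum_{x=0}^{m} \frac{(D - x)(M - m - D + x)(M - m - n)}{(M - m - 1)M^2 n} \binom{D}{x}\binom{M - D}{m - x} \Big/ \binom{M}{m} \\
&= \sum_{x=0}^{m} \frac{(M - m - n)(x + 1)(m - x + 1)(M - m)}{M^2 n (m + 1)(m + 2)} \binom{D}{x + 1}\binom{M - D}{m - x + 1} \Big/ \binom{M}{m + 2} \\
&= \sum_{w=1}^{m+1} \frac{(M - m - n)\,w(m + 2 - w)(M - m)}{M^2 n (m + 1)(m + 2)} \binom{D}{w}\binom{M - D}{m + 2 - w} \Big/ \binom{M}{m + 2} \\
&= B \sum_{w=1}^{m+1} \binom{D}{w}\binom{M - D}{m + 2 - w} \Big/ \binom{M}{m + 2} \le B.
\end{aligned}$$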


Exact results on expected sample sizes are not available, but an indication of
the possible savings is given by regarding the results in the binomial case to be
limits approached as $M$ becomes infinite. Additional information is given by
the range of $n(x)$: for $m$ even,
$$\frac{(M - m)^2}{BM^2(m + 2) + (M - m)} \le n(x) \le \frac{(M - m)^2(m + 3)^2}{4BM^2(m + 1)(m + 2) + (m + 3)^2(M - m)}.$$

With a single sample of size $r$, the best unbiased estimate has variance not
exceeding $B$ for all $\theta$ provided $r$ is at least $M/(4B(M - 1) + 1)$. If for example
$M = 1{,}000$, $B = .0001$, and $m = 80$, then $r = 714$ while $173 \le m + n(x) \le 728$;
when $\theta = 0$ or 1, the two-stage estimate saves 541 observations (76%), while
when $\theta = 1/2$ its maximum sample size exceeds $r$ by less than 14 observations
(2%).
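The hypergeometric variance bound and the numerical example can be checked with exact rational arithmetic. The sketch below is illustrative only: the function names are hypothetical, $n(x)$ is left unrounded, and the small population used for the variance check ($M = 20$, $D = 7$, $m = 4$, $B = 1/100$) is an arbitrary assumption, not a case taken from the paper.

```python
from fractions import Fraction
from math import comb


def second_sample_size(x, m, M, B):
    """n(x) from Section 4.3, kept as an exact (unrounded) fraction."""
    k = (x + 1) * (m - x + 1)
    return Fraction((M - m) ** 2 * k) / (B * M ** 2 * (m + 1) * (m + 2) + k * (M - m))


def variance(D, m, M, B):
    """E var(theta_hat | x), enumerated over the first-sample count x."""
    total = Fraction(0)
    for x in range(m + 1):
        p = Fraction(comb(D, x) * comb(M - D, m - x), comb(M, m))
        if p == 0:
            continue
        n = second_sample_size(x, m, M, B)
        cond = (Fraction((D - x) * (M - m - D + x)) * (M - m - n)
                / ((M - m - 1) * M ** 2 * n))
        total += p * cond
    return total


def variance_closed_form(D, m, M, B):
    """B * sum_{w=1}^{m+1} C(D,w) C(M-D,m+2-w) / C(M,m+2)."""
    s = sum(comb(D, w) * comb(M - D, m + 2 - w) for w in range(1, m + 2))
    return B * Fraction(s, comb(M, m + 2))


def sample_size_range(m, M, B):
    """Lower and upper bounds on n(x) as displayed in the text."""
    lo = Fraction((M - m) ** 2) / (B * M ** 2 * (m + 2) + (M - m))
    hi = (Fraction((M - m) ** 2 * (m + 3) ** 2)
          / (4 * B * M ** 2 * (m + 1) * (m + 2) + (m + 3) ** 2 * (M - m)))
    return lo, hi


def single_sample_size(M, B):
    """r = M / (4B(M-1) + 1)."""
    return Fraction(M) / (4 * B * (M - 1) + 1)


if __name__ == "__main__":
    B = Fraction(1, 100)
    print(variance(7, 4, 20, B) == variance_closed_form(7, 4, 20, B))
    lo, hi = sample_size_range(80, 1000, Fraction(1, 10000))
    print(float(80 + lo), float(80 + hi), float(single_sample_size(1000, Fraction(1, 10000))))
```

With $M = 1{,}000$, $B = .0001$, $m = 80$, the bounds on $m + n(x)$ round to 173 and 728 and $r$ truncates to 714, in line with the figures quoted above.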

4.4. Mean of a Normal Distribution with Unknown Variance. If $X$ has the
normal density function $f(x, \theta, \sigma) = (2\pi\sigma^2)^{-1/2} \exp(-(x - \theta)^2/2\sigma^2)$, with $\sigma^2$
unknown, we may apply the present method with an advantageous modification
based on the independence of $\bar x_m = (1/m) \sum_1^m x_i$ and
$$s^2 = (1/(m - 1)) \sum_1^m (x_i - \bar x_m)^2.$$

Take any $m > 3$, let $\hat\sigma^2 = (m - 1)s^2/(m - 3)$, $n = \max(0, \hat\sigma^2/B - m)$, and
$\hat\theta = (1/(m + n)) \sum_1^{m+n} x_i$. It is easily verified that the latter estimator is
unbiased and has variance not exceeding $B$. For most purposes Stein's procedure [1]
which gives confidence intervals of prescribed length will probably be preferred;
optimal choice of $m$ for this procedure has been extensively investigated [7]
and [17].
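The two-stage rule of Section 4.4 can be illustrated by simulation. This is a rough sketch under stated assumptions, not a procedure from the paper: $n$ is rounded up to an integer (which can only lower the variance), and the parameter values, seed, and function names are all hypothetical.

```python
import math
import random
import statistics


def two_stage_mean(rng, theta, sigma, m, B):
    """One draw of theta_hat: first sample of size m, then n more observations."""
    first = [rng.gauss(theta, sigma) for _ in range(m)]
    s2 = statistics.variance(first)             # first-sample variance s^2
    sigma2_hat = (m - 1) * s2 / (m - 3)         # E(1 / sigma2_hat) = 1 / sigma^2
    n = max(0, math.ceil(sigma2_hat / B - m))   # rounded up (assumption of this sketch)
    second = [rng.gauss(theta, sigma) for _ in range(n)]
    return (sum(first) + sum(second)) / (m + n)


def simulate(reps=20000, theta=1.0, sigma=2.0, m=10, B=0.25, seed=0):
    """Monte Carlo mean and variance of the two-stage estimator."""
    rng = random.Random(seed)
    draws = [two_stage_mean(rng, theta, sigma, m, B) for _ in range(reps)]
    return statistics.mean(draws), statistics.variance(draws)


if __name__ == "__main__":
    mean_est, var_est = simulate()
    print(mean_est, var_est)  # mean should be near theta, variance no larger than about B
```

The key step is that $n$ depends on the first sample only through $s^2$, which is independent of $\bar x_m$, so that conditionally on $n$ the overall mean has variance $\sigma^2/(m+n) \le B\sigma^2/\hat\sigma^2$.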
4.5 Estimation of a Scale Parameter. Let $X$ have a density function
$$f(x, \theta) = (1/\theta)\,g(x/\theta), \qquad x \ge 0,$$
and $f(x, \theta) = 0$ otherwise, where $g(u)$ is a known function with
$$c_1 = \int_0^\infty u\,g(u)\,du = E(X \mid \theta = 1),$$
$$c_2 = \int_0^\infty u^2 g(u)\,du = E(X^2 \mid \theta = 1) < \infty,$$
$$c_3 = \int_0^\infty u^{-2} g(u)\,du = E(1/X^2 \mid \theta = 1) < \infty.$$
Then $E(X/c_1) = \theta$, $\operatorname{Var}(X/c_1) = \theta^2(c_2/c_1^2 - 1) = \sigma^2$, say, and $E(1/X^2) = c_3/\theta^2$.
Letting $\hat\sigma^2 = m c_3 (c_2/c_1^2 - 1)\left[\sum_1^m 1/x_i^2\right]^{-1}$, we have
$$E(1/\hat\sigma^2) = \{1/[c_3(c_2/c_1^2 - 1)]\}\,E(1/X^2) = 1/\sigma^2.$$

Thus an unbiased estimator of the scale-parameter $\theta$, having variance $B$, is
$\hat\theta = \sum_{i=m+1}^{m+n} x_i / c_1 n$. The choice of $m$ may be made so as to minimize, at any
guessed value of $\theta$,
$$E_\theta(N') = m + E_\theta(n) = m + E_\theta(\hat\sigma^2)/B.$$

For any specific density function $f(x, \theta)$, it may be possible to find an estimate
$\tilde\sigma^2$ preferable to the $\hat\sigma^2$ given above: $\tilde\sigma^2 = \tilde\sigma^2(x_1, \ldots, x_m)$ is clearly preferable to
$\hat\sigma^2$ if it (has the essential property $E(1/\tilde\sigma^2) \le 1/\sigma^2$ and also) makes $\hat\theta$ more
efficient, that is, if $E(\tilde\sigma^2) < E(\hat\sigma^2)$. (This remark may be applied also to the
estimators $\hat\sigma^2$ discussed in other sections of this paper.)

For example, if $X$ has the gamma density
$$f(x, \theta) = (1/\theta\,a!)(x/\theta)^a e^{-x/\theta}, \qquad x \ge 0,$$
where $a$ is known, $a > -1$, then
$$c_1 = \frac{1}{a!}\int_0^\infty u^{a+1} e^{-u}\,du = a + 1,$$
$$c_2 = \frac{1}{a!}\int_0^\infty u^{a+2} e^{-u}\,du = (a + 1)(a + 2),$$
and, provided we require $a > 1$,
$$c_3 = \frac{1}{a!}\int_0^\infty u^{a-2} e^{-u}\,du = 1/a(a - 1).$$

Thus $\hat\sigma^2 = m\left[(a - 1)a(a + 1) \sum_1^m 1/x_i^2\right]^{-1}$. The evaluation of $E_\theta(\hat\sigma^2)$, required
to compute $E_\theta(N')$, appears difficult for $m > 1$ and has not been carried out.
For $m = 1$,
$$E_\theta(\hat\sigma^2) = E_\theta(X^2)/(a - 1)a(a + 1) = \theta^2 c_2/(a - 1)a(a + 1) = \theta^2(a + 2)/a(a - 1),$$
and $E_\theta(N') = 1 + \theta^2(a + 2)/a(a - 1)B$. Presumably the estimator
$$\tilde\sigma^2 = K\left(\sum_1^m x_i\right)^2,$$
where $K$ is such that $E_\theta(1/\tilde\sigma^2) = 1/\sigma^2$, is preferable to $\hat\sigma^2$ in this example, since
$\tilde\sigma^2$ is a function of the sufficient statistic $\sum_1^m x_i$ (based on the first sample) and
$\hat\sigma^2$ is not. Results of Ghurye [15] lend support to this conjecture. An estimator
based on $\tilde\sigma^2$ is given in Section 5.3 below.
The estimation of any given power $\theta^p$ of a scale-parameter $\theta$, $p = \pm 1, \pm 2, \ldots$,
may be treated similarly.


5. Other estimation problems. 


5.1 Variance of a normal distribution with unknown mean. Let $X$ have the
normal density function with unknown mean and variance as in Section 4.4,
but let $\theta$ now denote the unknown variance. For any $m > 5$, let
$$n = 2s^4(m - 1)^2/B(m - 3)(m - 5) + 1,$$
where $s^2$ is the first sample variance defined as above. Then it is readily verified
that an unbiased estimator of $\theta$, with variance not exceeding $B$, is given by the
second sample variance
$$\hat\theta = \sum_{i=m+1}^{m+n} \left(x_i - \frac{1}{n}\sum_{j=m+1}^{m+n} x_j\right)^2 \Big/ (n - 1).$$

For given $B$ and a guessed value of $\theta$, $m$ may be chosen so as to minimize $E_\theta(N')$.

5.2 Estimation of a "between classes" variance component. Consider the usual
assumptions for a one-way analysis of variance, with $n$ observations from each
of $k$ classes: $Y_{ij} = \mu + c_i + e_{ij}$, $i = 1, \ldots, k$, $j = 1, \ldots, n$, with $\mu$ an unknown
constant, and the $c_i$'s and $e_{ij}$'s all independently normally distributed with means
zero and unknown variances
$$\operatorname{var}(c_i) = \sigma_0^2, \qquad \operatorname{var}(e_{ij}) = \sigma^2, \qquad i = 1, \ldots, k, \; j = 1, \ldots, n.$$

The usual between classes mean square $s_0^2$ has expected value $\sigma^2 + n\sigma_0^2$ and $k - 1$
degrees of freedom. The usual within classes mean square $s^2$ has expected value
$\sigma^2$ and $k(n - 1)$ degrees of freedom. Then $(s_0^2 - s^2)/n$ is an unbiased estimator
of $\sigma_0^2$, with variance
$$2[(\sigma^2 + n\sigma_0^2)^2/(k - 1) + \sigma^4/k(n - 1)]/n^2$$
when $k$ and $n$ are fixed.

Alternatively, suppose a first sample of $r$ classes and $n$ observations per class
has been taken. Let $T_0^2$ and $T^2$ respectively denote the between and within
classes mean squares, based respectively on $\nu_0 = r - 1 > 4$ and $\nu = r(n - 1) > 4$
degrees of freedom. Then it is easily verified that
$$E\left[(\nu_0 - 2)(\nu_0 - 4)/\nu_0^2 T_0^4\right] = 1/(\sigma^2 + n\sigma_0^2)^2$$
and
$$E\left[(\nu - 2)(\nu - 4)/\nu^2 T^4\right] = 1/\sigma^4.$$
This leads to the choice of $k$ defined by $k = \max(2, k', k'')$ where
$$k' = 1 + 2T_0^4\nu_0^2/(\nu_0 - 2)(\nu_0 - 4)(B - b)n^2,$$
$$k'' = 2T^4\nu^2/(\nu - 2)(\nu - 4)\,b\,n^2(n - 1),$$
and $b$ is any constant, $0 < b < B$. To see that with $k$ so defined, the sampling
variance of $\hat\sigma_0^2$ does not exceed $B$, observe that
$$\begin{aligned}
B &= 2[(\sigma^2 + n\sigma_0^2)^2 E(1/(k' - 1)) + (\sigma^4/(n - 1))E(1/k'')]/n^2 \\
&\ge 2[(\sigma^2 + n\sigma_0^2)^2 E(1/(k - 1)) + (\sigma^4/(n - 1))E(1/k)]/n^2 \\
&= \operatorname{var}(\hat\sigma_0^2).
\end{aligned}$$

The choice of $n$ would ordinarily be influenced by practical limitations on the
experiment, and the choice of both $n$ and $b$ could also be governed by an a priori
estimate of $\sigma^2/\sigma_0^2$.

An alternative approach to the present problem is to apply twice the method
of the preceding Section 5.1 as follows: Estimate $(\sigma_0^2 + \sigma^2)$ by a two-stage
estimator $s_1^2$ having variance not exceeding $B_1 < B$, based on observations
$Y_{11}, Y_{21}, \ldots, Y_{m1}, Y_{(m+1)1}, \ldots, Y_{(m+n)1}$, so that only one observation is taken
from each class. Secondly, estimate $\sigma^2$ by a two-stage estimator $s_2^2$ having variance
not exceeding $B_2$, where $B_2 = B - B_1$, based on additional observations within
any one class (or on additional "within degrees of freedom" from several classes).
Then $s^2 = s_1^2 - s_2^2$ is the required estimate, for $E(s^2) = \sigma_0^2$, and
$$\operatorname{var}(s^2) \le B_1 + B_2 = B.$$

Rules for optimal choice of $B_1$, and comparisons with the preceding method,
remain to be developed.

5.3 Scale parameter of a gamma distribution. If $X$ has the Gamma density
defined in Section 4.5 above,
$$\operatorname{Var}(X/c_1) = \operatorname{Var}(X/(a + 1)) = \theta^2(c_2/c_1^2 - 1) = \theta^2/(a + 1) = \sigma^2,$$
and we may take
$$\hat\sigma^2 = \left(\sum_1^m x_i\right)^2 \Big/ (a + 1)(ma + m - 2)(ma + m - 1),$$
for all $a$ and $m$ such that $(ma + m - 2) > 0$. This gives $E(1/\hat\sigma^2) = 1/\sigma^2$, and
$$E(\hat\sigma^2) = \theta^2(ma + m + 1)(ma + m)/(a + 1)(ma + m - 1)(ma + m - 2).$$
For any guessed value of $\theta$, $m$ may be chosen, subject to $m > 2/(a + 1)$, so as
to minimize $E_\theta(N') = m + E_\theta(\hat\sigma^2)/B$.
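The choice of $m$ described above amounts to a one-dimensional search. The sketch below is one possible way to carry it out, with hypothetical function names and an arbitrary search limit `m_max`; the illustrative values $a = 1$, $\theta = 1$, $B = 0.001$ are assumptions, not figures from the paper.

```python
def expected_total_size(m, a, theta, B):
    """E_theta(N') = m + E_theta(sigma_hat^2)/B for the gamma scale problem."""
    k = m * (a + 1)                      # shape parameter of the first-sample total
    if k <= 2:
        raise ValueError("need m > 2/(a + 1)")
    e_sigma2 = theta ** 2 * (k + 1) * k / ((a + 1) * (k - 1) * (k - 2))
    return m + e_sigma2 / B


def best_first_sample_size(a, theta, B, m_max=10000):
    """First-sample size m minimizing E_theta(N') at the guessed theta."""
    candidates = [m for m in range(1, m_max + 1) if m * (a + 1) > 2]
    return min(candidates, key=lambda m: expected_total_size(m, a, theta, B))


if __name__ == "__main__":
    m_best = best_first_sample_size(a=1.0, theta=1.0, B=0.001)
    print(m_best, expected_total_size(m_best, 1.0, 1.0, 0.001))
```

For large $m$ the objective behaves roughly like $m + \sigma^2/B + 4\theta^2/[(a+1)^2 B m]$, so the minimizing $m$ is of order $B^{-1/2}$, as in the binomial case.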

A modification analogous to that in Section 4.5 above, replacing $\sum x_i$ by
$(\sum x_i)^p$ throughout, with a corresponding modification of constants, gives an
estimator of $\theta^p$ with variance $B$.

6. Applications to achieve homoscedasticity. Many standard techniques for 
comparing means, related to the analysis of variance (Model I), are seriously 
dependent for validity on the assumption that observations have (approxi- 
mately) equal variances, but much less seriously dependent on the usual assump- 
tion of normality (see, for example, [8] and references therein). It is frequently 
desired to apply such methods to means of observations having some of the 
distributions considered in Section 4 above; but in such cases the unknown 
variances are functions of the unknown means of observations, and hence the 
assumption of equal variances generally holds only when the unknown means 
happen to be equal. 

The methods of the present paper provide a way of meeting this difficulty 
which may be considered in cases where it is feasible to use a two-stage sampling 
method providing (approximately) a common prescribed variance B for the 
observation in each cell of any Model I experimental design. Techniques re- 
lated to analysis of variance will be used taking, formally, the case of an infinite 
number of degrees of freedom for the error mean square; the latter, of course, 
will not be calculated from data, but the known variance B will be used instead. 
The methods which are usually considered for meeting this difficulty are
variance-stabilizing transformations of the observations (see, for example, [9]).
Concerning the relative advantages and disadvantages of these approaches, 
it should be noted that the goal of (a) variance-stabilization for application of 
standard inference techniques is usually of interest simultaneously with certain 
goals of (b) precision of estimation (or power of tests), (c) efficient utilization of 
data obtained, and (d) simplicity of interpretation. Concerning (d), use of the 
methods of this paper offers some advantages over use of transformations since 
the former provides inferences directly about the means of interest with pre- 
scribed precision on their original scale, rather than inferences about functions 
of those means (e.g., $E(\sin^{-1}(x/n)^{1/2})$ in the binomial case) which are often harder
to interpret and perhaps less meaningful. Furthermore, the latter estimators 
lose their constant-precision property when interpreted in the original units 
of the parameters. 

In cases like the Poisson there is no single-sample procedure which provides 
even bounded, let alone prescribed, precision in the original scale. Hence if such 
prescribed precision is one goal of interest, sequential methods more or less 
like those of this paper are required, and the simultaneous achievement of sim- 
plicity, and of exactly or approximately known common variances of estimators 
of means, may be regarded as convenient desirable by-products of the method. 

In cases like the binomial, the goals of bounded precision and homoscedasticity 
are attainable by use of transformed single-sample estimates. In the binomial 
case, we have seen above that when high constant precision is desired, the two- 
sample estimate is on the whole rather efficient, and in this case again affords 
the properties of homoscedasticity and simplicity. If only low precision is re- 
quired, there is some conflict between the goals mentioned. For example, for 
binomial estimation with $B = (.05)^2$ it was shown above that a first-sample
size of $m \ge 20$ gives an inefficient estimate, but $m \ge 20$ is required for a good
degree of homoscedasticity. In such cases efficiency considerations may be
weighed against considerations of simplicity of application and interpretation.
If it can be assumed that $.2 \le \theta \le .8$, then $m \ge 10$ suffices to give a variation
of at most 7% in variances of $\hat\theta$'s. If it can be assumed that $.1 \le \theta \le .9$, the
variation is at most 10% if $m \ge 20$.
A MONOTONICITY PROPERTY OF THE SEQUENTIAL
PROBABILITY RATIO TEST¹


By ROBERT A. WIJSMAN
University of Illinois


0. Summary. Using the basic inequalities (1) it is shown that, if, in a sequential 
probability ratio test, the upper stopping bound is increased and the lower 
stopping bound decreased, and if the new test is not equivalent to the old one, 
then at least one of the error probabilities is decreased. This implies the monoto- 
nicity result of Weiss [5] in the continuous case, and the uniqueness result of 
Anderson and Friedman [1] in the general case. The relation of the monotonicity 
property to the optimum property and the uniqueness of sequential probability 
ratio tests is discussed. 

The monotonicity property is a consequence of the following stronger result.
Let the old and new tests be given by the stopping bounds $(B', A')$ and $(B, A)$,
respectively, with $B \le B' < A' \le A$; let $(\alpha_1', \alpha_2')$ and $(\alpha_1, \alpha_2)$ be the error
probabilities and $\Delta\alpha_i = \alpha_i - \alpha_i'$ the changes in the error probabilities; then the
vector $(\Delta\alpha_1, \Delta\alpha_2)$ is restricted to a cone consisting of the 3rd quadrant, plus
the part of the 2nd quadrant where $-\Delta\alpha_2/\Delta\alpha_1 < B$, plus the part of the 4th
quadrant where $-\Delta\alpha_2/\Delta\alpha_1 > A$. Another consequence of this result is that
$(\alpha_1, \alpha_2)$ cannot lie in the closed triangle with vertices $(\alpha_1', \alpha_2')$, $(0, 1)$ and $(1, 0)$.
Finally, the following monotonicity property follows: If the lower stopping
bound is fixed and the upper stopping bound increased, then $\alpha_1/(1 - \alpha_2)$
decreases monotonically. The same holds for $\alpha_2/(1 - \alpha_1)$ if the upper stopping
bound is held fixed and the lower stopping bound decreased.


1. Introduction and discussion. We consider Wald’s sequential probability 
ratio test [3] with upper stopping bound A and lower stopping bound B. It is 
usually assumed that B < 1 < A, but no such restriction will be made in this 
paper. Weiss [5] has shown, under certain continuity assumptions, that, if A and 
B are separated in such a way that one of the error probabilities remains con- 
stant, then the other error probability decreases monotonically. This is a very 
useful result, since it not only provides a uniqueness proof, but also it shows that 
there exists a test of given strength if and only if the error probability vector lies 
in a certain set [6]. In this paper a monotonicity property will be proved which
makes no assumptions regarding the probability distributions (other than that
they be non-degenerate) and which includes Weiss' result as a special case. The
monotonicity property, stated and proved in Section 2, can be described as
follows: if the upper stopping bound of a sequential probability ratio test is in-
creased and the lower stopping bound decreased, then at least one of the error 
probabilities decreases, unless the new test is equivalent to the old one, in 
which case the error probabilities are, of course, unchanged. (Two tests will be
called equivalent, more or less following [1], if their sample sequences differ on a 
set of probability 0 under both distributions.) Weiss’ result is obtained as a par- 
ticular case by specifying the distributions to be continuous, with positive proba- 
bilities in non-degenerate intervals, and by reading the conclusion: then if one 
of the error probabilities is fixed, the other decreases. 

Before proving the indicated monotonicity property, its relation to the 
uniqueness and to the optimum property [4] of sequential probability ratio tests 
will be discussed. In [1] it is shown how the optimum property can be used to 
prove uniqueness, i.e. the fact that two sequential probability ratio tests with the 
same error probabilities are equivalent. The restriction B < 1 < A had to be 
made, though, since the optimum property had been proved only under this 
condition. Actually, this restriction is unnecessary. It will be indicated in a future 
paper [2] that any sequential probability ratio test has the optimum property 
among all tests which take at least one observation. In particular, then, every 
sequential probability ratio test has the optimum property among all sequential 
probability ratio tests, which is all that is needed in the uniqueness proof in [1]. 
This kind of optimum property will be labeled restricted in the following. 

First of all it will be shown now that the restricted optimum property and the
monotonicity property are equivalent. The following notation and terminology
will be used: the error probabilities corresponding to the two distributions under
consideration are denoted by $\alpha_i$, $i = 1, 2$; the expected sample sizes are $\nu_i$; in
passing from one test to another, $\Delta\alpha_i$ and $\Delta\nu_i$ denote the changes in the $\alpha_i$ and
$\nu_i$; a test will be called inadmissible if there exists another test such that $\Delta\alpha_i \le 0$,
$\Delta\nu_i \le 0$, $i = 1, 2$, with strict inequality in at least one of the four. Obviously,
the optimum property implies admissibility, and the restricted optimum property
implies restricted admissibility, i.e. admissibility within the class of sequential
probability ratio tests. Consider a sequential probability ratio test $(B, A)$
and another, $(B^*, A^*)$, with $B^* \le B < A \le A^*$. Unless the two tests are equivalent,
we have $\Delta\nu_i > 0$ for both $i$ (see Section 2 for support of this statement, and
similar ones to follow). Assume the restricted optimum property. This implies
restricted admissibility, and this implies that $\Delta\alpha_i < 0$ for at least one $i$. In other
words, one of the $\alpha_i$ has to decrease, which is the monotonicity property.
Conversely, assume the monotonicity property, and compare tests $(B, A)$ and
$(B^*, A^*)$, which are supposed to be not equivalent and for which $\Delta\alpha_i \le 0$ for
both $i$. Then we cannot have $B^* \le B$, $A^* \le A$, for in that case $\Delta\alpha_1 > 0$ and
$\Delta\alpha_2 < 0$. Similarly, $B^* \ge B$, $A^* \ge A$ is excluded. Also $B \le B^* < A^* \le A$ is
excluded since otherwise, by the monotonicity property, one of the $\Delta\alpha_i$ would
be positive. Hence the only remaining possibility is $B^* \le B < A \le A^*$, which
implies $\Delta\nu_i > 0$ for both $i$, i.e. the optimum property.

Secondly, the monotonicity property implies uniqueness. For, if the stopping 
bounds are changed in the same direction, then both error probabilities change 
(in opposite directions), whereas, if the stopping bounds are changed in opposite 
directions, then according to the monotonicity property at least one of the error 
probabilities changes; unless, of course, the two tests are equivalent. 
It is true that a separate proof of the monotonicity property is not strictly
necessary, since this property is a consequence of the optimum property. How- 
ever, the optimum property is a rather deep theorem, requiring a sizable machin- 
ery for its proof, whereas the monotonicity property follows in an elementary 
way from the basic inequalities (1). Since two interesting properties of sequential 
probability ratio tests within their own class—the uniqueness and the restricted 
optimum property—are immediate consequences of the monotonicity property, 
it seems worth-while to prove the latter independently. Moreover, the methods 
used yield a stronger result, which does not follow from the optimum property 
and which has, besides the monotonicity property, some other interesting con- 
sequences. These further results are obtained in Section 3. 


2. Statement and proof of the monotonicity property. Let $X_1, X_2, \ldots$ be a
sequence of independent and identically distributed random variables (or
vectors) with common density $p_i$ with respect to some sigma-finite measure. Here
and in the following, $i$ runs over 1 and 2, corresponding to the two hypotheses
under consideration. The trivial case $p_1 = p_2$ a.e. will be excluded. Let $Y_n$ be
the probability ratio at the $n$th observation, i.e. $Y_n = \prod_{j=1}^{n} p_2(X_j)/p_1(X_j)$. If
some stopping rule is defined, let $N$ be the random number of observations. Of
fundamental importance in what follows is the basic double inequality

(1)  $aP_1(a < Y_N < b) \le P_2(a < Y_N < b) \le bP_1(a < Y_N < b)$

for any real numbers $a$ and $b$, including $\infty$. The strict inequality signs within the
parentheses in (1) may be replaced by less-or-equal signs, and we will do so
whenever this is convenient. For instance, the following inequalities will be
considered special cases of (1):

(2)  $P_2(Y_N \ge a) \ge aP_1(Y_N \ge a)$
(3)  $P_2(Y_N \le b) \le bP_1(Y_N \le b)$.

These basic inequalities have been used already by Wald ([3], Section 3.2) and
are briefly discussed there. Also Weiss [5] makes use of (3). An important
consequence of (1) is that either

(4)  $P_1(a < Y_N < b) = P_2(a < Y_N < b) = 0$

or

(5)  $P_1(a < Y_N < b) > 0$ and $P_2(a < Y_N < b) > 0$.

As an application, compare the sequential probability ratio tests $(B, A)$ and
$(B, A^*)$, with $B < A < A^*$. In (4) and (5) identify $a$ with $A$, $b$ with $A^*$, and
$N$ with the random number of observations if test $(B, A)$ is used. If (4) prevails,
the two tests are clearly equivalent. If (5) prevails we can conclude $\Delta\alpha_1 < 0$,
$\Delta\alpha_2 > 0$, $\Delta\nu_i > 0$ for both $i$. Similar conclusions can be drawn if both stopping
bounds are changed, and these facts have already been used in the discussion
in Section 1.
In a sequential probability ratio test with stopping bounds $s$ and $t$ ($s < t$)²
and random number of observations $N$, the error probabilities $\alpha_i$ are functions of
$s$ and $t$,

(6)  $\alpha_1(s, t) = P_1(Y_N \ge t) = 1 - P_1(Y_N \le s)$,
(7)  $\alpha_2(s, t) = P_2(Y_N \le s) = 1 - P_2(Y_N \ge t)$.

It is convenient to introduce the functions $U_i$ and $V_i$ defined by

(8)  $U_i(s, t) = P_i(Y_N \le s)$
(9)  $V_i(s, t) = P_i(Y_N \ge t)$

if $s$ and $t$ are the stopping bounds. We have

(10)  $U_i(s, t) + V_i(s, t) = 1$,

and the relation between the $\alpha_i$, $U_i$ and $V_i$ is simply

(11)  $\alpha_1(s, t) = V_1(s, t)$,  $\alpha_2(s, t) = U_2(s, t)$.


THEOREM 1. Let $(u, v)$ and $(u', v')$ define two non-equivalent sequential
probability ratio tests, with $0 < u \le u' < v' \le v < \infty$, and let $\Delta\alpha_i = \alpha_i(u, v) -
\alpha_i(u', v')$, $i = 1, 2$. Then at least one of the $\Delta\alpha_i$ must be $< 0$.

PROOF. Let $N$ be the random number of observations in the test $(u', v')$, and
define $F_i(y) = P_i(Y_N \le y)$. Then, using (11), we have

(12)  $\Delta\alpha_1 = V_1(u, v) - V_1(u', v')$
(13)  $\Delta\alpha_2 = U_2(u, v) - U_2(u', v')$.

We compute³

(14)  $V_1(u', v') = \int_{v'}^{v} dF_1(y) + \int_{v}^{\infty} dF_1(y)$

(15)  $V_1(u, v) = \int_{v}^{\infty} dF_1(y) + \int_{v'}^{v} V_1\left(\frac{u}{y}, \frac{v}{y}\right) dF_1(y) + \int_{u}^{u'} V_1\left(\frac{u}{y}, \frac{v}{y}\right) dF_1(y)$.

In (15) we used the fact that the $Y_{n+1}/Y_n$ are independent and identically
distributed. Substitution into (12) and using (10) gives

(16)  $\Delta\alpha_1 = \int_{u}^{u'} V_1\left(\frac{u}{y}, \frac{v}{y}\right) dF_1(y) - \int_{v'}^{v} U_1\left(\frac{u}{y}, \frac{v}{y}\right) dF_1(y)$.


² For notational convenience we shall henceforth use lower case symbols instead of $A$
and $B$ for the stopping bounds.

³ In (14) the lower limits on the integrals should, and the upper limits should not be
included in the integrations. On the other hand, in (15) in the third integral on the right
the lower limit $u$ should not be included and the upper limit $u'$ should. These facts have not
been made explicit in the formulas, since they are inessential for the proof.
Similarly,

(17)  $\Delta\alpha_2 = \int_{v'}^{v} U_2\left(\frac{u}{y}, \frac{v}{y}\right) dF_2(y) - \int_{u}^{u'} V_2\left(\frac{u}{y}, \frac{v}{y}\right) dF_2(y)$.

Suppose temporarily that $v' < v$. Then for $y$ in the interval $[v', v)$ we have, using
(8) and (3),

(18)  $U_2\left(\frac{u}{y}, \frac{v}{y}\right) \le \frac{u}{y}\,U_1\left(\frac{u}{y}, \frac{v}{y}\right) \le \frac{u}{v'}\,U_1\left(\frac{u}{y}, \frac{v}{y}\right)$,

and, using (1),

(19)  $dF_2(y) \le v\,dF_1(y)$

so that

(20)  $\int_{v'}^{v} U_2\left(\frac{u}{y}, \frac{v}{y}\right) dF_2(y) \le \frac{uv}{v'} \int_{v'}^{v} U_1\left(\frac{u}{y}, \frac{v}{y}\right) dF_1(y)$.

In (19), and therefore in (20), there is strict inequality unless

(21)  $\int_{v'}^{v} dF_i(y) = 0$ for both $i$.

If $v' = v$, (20) remains trivially true. Similarly,

(22)  $\int_{u}^{u'} V_2\left(\frac{u}{y}, \frac{v}{y}\right) dF_2(y) \ge \frac{uv}{u'} \int_{u}^{u'} V_1\left(\frac{u}{y}, \frac{v}{y}\right) dF_1(y)$,

and again this inequality is strict unless

(23)  $\int_{u}^{u'} dF_i(y) = 0$ for both $i$.

The tests are equivalent if and only if both (21) and (23) hold. Therefore, if
the tests are not equivalent, then at least one of the inequalities in (20) and (22)
is strict. Using (16), (17), (20) and (22), it is now easy to verify the following
two inequalities:

(24)  $uv\,\Delta\alpha_1 + v'\,\Delta\alpha_2 < 0$
(25)  $uv\,\Delta\alpha_1 + u'\,\Delta\alpha_2 < 0$.

The conclusion of Theorem 1 is, of course, an immediate consequence of either
of the inequalities (24) and (25).


3. Strengthening of the result. Let $\Delta\alpha$ be the 2-vector whose components
$\Delta\alpha_i$ are defined in Theorem 1. If the two tests are equivalent, then, of course,
$\Delta\alpha = 0$. Otherwise, the conclusion of Theorem 1 states that $\Delta\alpha$ cannot lie in the
set defined by $\Delta\alpha_i \ge 0$ for both $i$, i.e. the (closed) 1st quadrant. In other words,
$\Delta\alpha$ has to lie in the 2nd, 3rd or 4th quadrant. However, the inequalities (24)
and (25) already claim something more: $\Delta\alpha$ is not only excluded from the 1st
quadrant but also from the part of the 2nd quadrant where $-\Delta\alpha_2/\Delta\alpha_1 \ge uv/v'$
(using (24)) and from the part of the 4th quadrant where $-\Delta\alpha_2/\Delta\alpha_1 \le uv/u'$
(using (25)). What remains is a cone of angle $< \pi$. We shall show now that we
can sharpen the bounds $uv/v'$ and $uv/u'$ for $-\Delta\alpha_2/\Delta\alpha_1$ to $u$ and $v$, respectively.
This will be the content of

THEOREM 2. Under the same conditions as in Theorem 1 we have

(26)  $u\,\Delta\alpha_1 + \Delta\alpha_2 < 0$
(27)  $v\,\Delta\alpha_1 + \Delta\alpha_2 < 0$.
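The two inequalities can be illustrated on a small lattice example. The sketch below is not taken from the paper: it assumes Bernoulli observations with $P(X = 1) = 1/3$ under the first hypothesis and $2/3$ under the second, so that $Y_n = 2^{S_n}$ for a $\pm 1$ random walk $S_n$, and it restricts the stopping bounds to $u = 2^{-a}$, $v = 2^{b}$. The error probabilities then follow from the classical gambler's-ruin formula; all names are hypothetical.

```python
from fractions import Fraction

P1 = Fraction(1, 3)  # P(X = 1) under hypothesis 1 (assumed for illustration)
P2 = Fraction(2, 3)  # P(X = 1) under hypothesis 2 (assumed for illustration)


def hit_upper_first(p, a, b):
    """P(a +/-1 walk from 0 with up-probability p reaches +b before -a)."""
    r = (1 - p) / p
    if r == 1:
        return Fraction(a, a + b)
    return (1 - r ** a) / (1 - r ** (a + b))


def error_probs(a, b):
    """(alpha_1, alpha_2) for the test with bounds u = 2**-a and v = 2**b."""
    alpha1 = hit_upper_first(P1, a, b)       # P_1(Y_N >= v)
    alpha2 = 1 - hit_upper_first(P2, a, b)   # P_2(Y_N <= u)
    return alpha1, alpha2


def theorem2_sides(a, b, a_old, b_old):
    """Left sides of (26) and (27) when passing from (u', v') to (u, v)."""
    u, v = Fraction(1, 2 ** a), Fraction(2 ** b)
    new1, new2 = error_probs(a, b)
    old1, old2 = error_probs(a_old, b_old)
    d1, d2 = new1 - old1, new2 - old2
    return u * d1 + d2, v * d1 + d2


if __name__ == "__main__":
    print(error_probs(1, 1), error_probs(2, 2))
    print(theorem2_sides(2, 2, 1, 1))  # both entries should be negative
```

Here the old test has bounds $(1/2, 2)$ and the new test $(1/4, 4)$; both error probabilities drop from $1/3$ to $1/5$, so $\Delta\alpha$ lies in the 3rd quadrant and both left sides are negative.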


Before proving Theorem 2 we will indicate some of its consequences. Consider
$u'$, $v'$ fixed and $u$, $v$ varying, subject to $u \le u' < v' \le v$. Consider all possible
$\Delta\alpha$. The cone given by (26) and (27) to which $\Delta\alpha$ is restricted depends on $u$ and
$v$. To obtain a fixed cone we remark that $-\Delta\alpha_2/\Delta\alpha_1 < u$ implies $-\Delta\alpha_2/\Delta\alpha_1 < u'$
and $-\Delta\alpha_2/\Delta\alpha_1 > v$ implies $-\Delta\alpha_2/\Delta\alpha_1 > v'$. Therefore, (26) and (27) imply

(28)  $u'\,\Delta\alpha_1 + \Delta\alpha_2 < 0$
(29)  $v'\,\Delta\alpha_1 + \Delta\alpha_2 < 0$.

The inequalities (28) and (29) are less sharp than (26) and (27), but they do
represent a fixed cone within which $\Delta\alpha$ is restricted as $u$ and $v$ vary. In fact, this
cone is the union of all cones given by (26) and (27) as $u$ and $v$ vary.

We can also consider the $\alpha_1$-$\alpha_2$ plane and see what happens to the vector
of error probabilities as $u'$, $v'$ is fixed and $u$, $v$ vary. The only portion of the plane
which needs to be considered is the triangle $\alpha_i \ge 0$, $\alpha_1 + \alpha_2 \le 1$. Let $\alpha_i = \alpha_i(u, v)$,
$\alpha_i' = \alpha_i(u', v')$, and let $\alpha = (\alpha_1, \alpha_2)$, $\alpha' = (\alpha_1', \alpha_2')$. The inequalities (28) and
(29) say that $\alpha$ lies in a cone with vertex $\alpha'$, containing the point $(0, 0)$, and
bounded by two lines with slopes $-u'$ and $-v'$. This cone does not contain any
point of the triangle with vertices $\alpha'$, $(0, 1)$ and $(1, 0)$. To see this we only have
to look at the slopes of the lines connecting $\alpha'$ with $(0, 1)$ and $(1, 0)$. The first
is $-(1 - \alpha_2')/\alpha_1'$, the second $-\alpha_2'/(1 - \alpha_1')$. Now, using (2), we have
$(1 - \alpha_2')/\alpha_1' \ge v' > u'$, and, using (3), $\alpha_2'/(1 - \alpha_1') \le u' < v'$, which establishes the fact
mentioned. Thus, $\alpha$ cannot lie in the closed triangle with vertices $\alpha'$, $(0, 1)$
and $(1, 0)$.

There is another consequence which is of enough interest in itself to state
separately. We introduce the quantities

(30)  $\beta_1 = \alpha_1/(1 - \alpha_2)$
(31)  $\beta_2 = \alpha_2/(1 - \alpha_1)$.

The quantities $\beta_i'$ are defined in the same manner in terms of the $\alpha_i'$, and $\Delta\beta_i =
\beta_i - \beta_i'$. Then $\beta_1$ is the tangent of the angle that the line through $\alpha$ and $(0, 1)$
makes with the $\alpha_2$-axis; $\beta_2$ has a similar interpretation. The result of the
preceding paragraph, namely that $\alpha$ is excluded from the closed triangle with
vertices $\alpha'$, $(0, 1)$ and $(1, 0)$, is then seen to be equivalent to

(32)  $\Delta\alpha_1 < 0 \Rightarrow \Delta\beta_1 < 0$
(33)  $\Delta\alpha_2 < 0 \Rightarrow \Delta\beta_2 < 0$.

This result can also be stated as

COROLLARY 1. Under the same conditions as in Theorem 1, at least one of the
$\Delta\beta_i$ must be $< 0$.

Now $\Delta\alpha_1 < 0$ is in particular satisfied if $u = u'$, and $\Delta\alpha_2 < 0$ if $v = v'$. Using
this, and referring to (32) and (33), we have

COROLLARY 2. Let the $\beta_i$ be defined by (30) and (31). If the lower stopping bound
$u$ of a sequential probability ratio test is fixed, $\beta_1$ is a monotonic non-increasing
function of the upper stopping bound $v$. The function is strictly monotonic except
in any point $v$ for which there is a $v^* > v$ such that the tests $(u, v)$ and $(u, v^*)$ are
equivalent. A completely analogous statement for $\beta_2$ is obtained by fixing $v$ and
decreasing $u$.

Finally, we remark that Theorem 2 can be generalized slightly. So far we have 
considered only sequential probability ratio tests whose continuation region is 
an open interval. We can also consider a sequential probability ratio test whose 
continuation interval contains one or both of its endpoints. In Theorems 1 and 
2 we shall then consider tests with continuation intervals J and J’, where J has 
endpoints $u$, $v$, and $J'$ has $u'$, $v'$, and such that $J' \subset J$. With this generalization
the conclusion (26) and (27) remains valid, except that one of the inequalities 
may be an equality. 

We proceed now with the proof of Theorem 2, which starts with (24) and (25).
Notice that for very small changes from $u'$ to $u$ and $v'$ to $v$ we have almost $u/u' =
1$ and $v/v' = 1$ so that then (26) and (27) follow approximately from (24) and
(25), respectively. The idea of the proof is to link the tests $(u', v')$ and $(u, v)$
by a chain of intermediate tests, each of which is close to the next one.

PROOF OF THEOREM 2. Consider the chain of tests $(u', v')$, $(u, v_0)$, $(u, v_1)$, $\ldots$,
$(u, v_n)$ in which $v_n = v$ and $v_1, \ldots, v_{n-1}$ is a sequence to be specified later. Put
$(\Delta\alpha_i)_0 = \alpha_i(u, v_0) - \alpha_i(u', v')$ and $(\Delta\alpha_i)_k = \alpha_i(u, v_k) - \alpha_i(u, v_{k-1})$, $k = 1, \ldots,
n$, where we identify $v_0$ with $v'$. In passing from $(u', v')$ to $(u, v')$ we have $(\Delta\alpha_1)_0 \ge
0$, with strict inequality unless $(u', v')$ and $(u, v')$ are equivalent. Consequently,
using (25) in the second inequality, $u(\Delta\alpha_1)_0 + (\Delta\alpha_2)_0 \le u(v'/u')(\Delta\alpha_1)_0 +
(\Delta\alpha_2)_0 \le 0$, with equality if and only if $(u', v')$ and $(u, v')$ are equivalent. Then
there exists $\epsilon_0 > 0$ such that for all $\epsilon$ with $0 < \epsilon < \epsilon_0$ we have

(34)  $u(\Delta\alpha_1)_0 + (1 - \epsilon)(\Delta\alpha_2)_0 \le 0$

with the same remark about equality as before. For fixed $\epsilon$, $0 < \epsilon < \epsilon_0$, choose
$v_1, \ldots, v_{n-1}$ in such a way that $1 - \epsilon < v_{k-1}/v_k < 1$, $k = 1, \ldots, n$. In passing
from $(u, v_{k-1})$ to $(u, v_k)$ we have $(\Delta\alpha_2)_k \ge 0$ so that $u(\Delta\alpha_1)_k + (1 - \epsilon)(\Delta\alpha_2)_k \le
u(\Delta\alpha_1)_k + (v_{k-1}/v_k)(\Delta\alpha_2)_k$. The right hand side of the last inequality is $\le 0$,
by (24), with equality if and only if $(u, v_{k-1})$ and $(u, v_k)$ are equivalent. We have
established now

(35)  $u(\Delta\alpha_1)_k + (1 - \epsilon)(\Delta\alpha_2)_k \le 0$,  $k = 0, 1, \ldots, n$

(for $k = 0$ this was established as (34)). In (35) there is strict inequality for
at least one $k$, otherwise $(u', v')$ and $(u, v)$ would be equivalent. Adding the
inequalities (35) for $k = 0, 1, \ldots, n$ yields

(36)  $u\,\Delta\alpha_1 + (1 - \epsilon)\Delta\alpha_2 < 0$.

Letting $\epsilon \to 0$ then gives the desired result (26). Inequality (27) is proved
analogously, using a chain of tests $(u', v')$, $(u', v)$, $(u_1, v)$, $\ldots$, $(u_n, v)$, with
$u_n = u$.
LOCALLY MOST POWERFUL RANK TESTS FOR TWO-SAMPLE
PROBLEMS¹


By HIROFUMI UZAWA
Stanford University


1. Summary and Introduction. In order to solve nonparametric statistical 
problems, it is often found useful to apply those criteria of optimality which are 
employed in parametric problems. In the present paper, we are concerned with 
nonparametric two-sample problems of testing the null hypothesis that two 
populations have the same distribution against certain nonparametric alternative
hypotheses, and generalize the parametric optimality criterion of being locally
most powerful. A rank test for a two-sample problem in this paper is called
locally most powerful if it is locally most powerful against a one-parameter 
family of alternatives. A criterion is constructed by which it is possible to solve 
the problem whether or not a given rank test is locally most powerful for a two- 
sample problem in which the set of all possible pairs of cumulative distribution 
functions is convex and closed (in the weak* topology). 

Let $X_1, \dots, X_{n_1}$ be sample elements from the first population, $X_{n_1+1}, \dots, X_n$ ($n = n_1 + n_2$) from the second population, and let the statistic $Z_j = 0$ or 1, according to whether the $j$th smallest observation is from the first population or the second, $j = 1, \dots, n$.

It will be shown that any locally most powerful rank test has the following 
form: 


(*)   Reject the null hypothesis if $\sum_j a_j Z_j > c$,
      Accept the null hypothesis if $\sum_j a_j Z_j < c$,

where $a_1, \dots, a_n$ are constant numbers. For the two-sided two-sample problem, any rank test of the form (*) is locally most powerful. For the one-sided two-sample problem, a non-trivial rank test of the form (*) is locally most powerful if, and only if,

$$c_j = \left[1 \Big/ \binom{n-2}{j}\right] \sum_{s=j}^{n-2} \binom{s+1}{j+1} (a_{s+2} - \bar{a}), \quad j = 0, \dots, n-2,$$

where

$$\bar{a} = \frac{1}{n}(a_1 + \cdots + a_n),$$

have all non-negative Hankel determinants (the precise definition of the Hankel determinants is given in Section 8 below).

For the symmetric two-sided two-sample problem, a non-trivial rank test of 
the form (*) is locally most powerful if, and only if, 


$$\sum_{s=j+1}^{n} \binom{s-1}{j} (a_{n+1-s} - a_s) = 0, \quad \text{for } j = 1, \dots, n.$$

Finally, it will be shown that for the two-sample problem, in which the alternative hypothesis is that the expectation of the first cumulative distribution function with respect to the second distribution is not less than $\tfrac12$, a non-trivial rank test of the form (*) is locally most powerful if, and only if,

$$\sum_{s=1}^{[(n+1)/2]} \left(\frac{n+1}{2} - s\right) (a_{n+1-s} - a_s) \ge 0.$$


2. Two-Sample Problems. Suppose that there are two statistical populations with cumulative distribution functions $F(x)$ and $G(x)$, $-\infty < x < +\infty$. A two-sample problem is concerned with testing a null hypothesis $H_0$ against a certain alternative hypothesis $H_1$ based upon the observation of finite random samples $X_1, \dots, X_{n_1}$ and $X_{n_1+1}, \dots, X_n$ taken from populations $F$ and $G$, respectively. We will confine ourselves here to the cases in which the sizes $n_1$ and $n_2 = n - n_1$ of random samples are fixed.

In the present paper, our main interest will be in the following two-sample 
problems: 

Problem (I): Two-sided two-sample problem. Test the null hypothesis $H_0$ that two populations $F$ and $G$ have the same distribution: $F = G$, against the alternatives $H_1$ that two populations have different distributions: $F \ne G$.

Problem (II): One-sided two-sample problem. Test the same null hypothesis $H_0$ against the alternatives $H_1$ that the first population $F$ is statistically smaller than the second population $G$:

$$F > G.$$


Problem (III): Symmetric two-sided two-sample problem. Test the null hypothesis $H_0$ that $F$ and $G$ are symmetric and identical against the alternatives $H_1$ that the two populations are both symmetric with the same median but are different.

Problem (IV): Test the null hypothesis $H_0$ against the alternatives $H_1$ that two populations have different distributions and the mean of the first $F$ with respect to $G$ is not less than $\tfrac12$:

$$F \ne G \quad \text{and} \quad \int F\,dG \ge \tfrac12.$$
For the sake of simplicity, we use the following notation: For two functions $F$ and $G$,

    $F = G$   if $F(x) = G(x)$ for all $x$,
    $F \ge G$   if $F(x) \ge G(x)$ for all $x$,
    $F > G$   if $F \ge G$ but $F \ne G$.

It will always be assumed that $F(x)$ is strictly increasing and continuous on $-\infty < x < +\infty$.


3. Rank Tests. Let $X_1, \dots, X_{n_1}$ and $X_{n_1+1}, \dots, X_n$ be two random samples taken from populations $F$ and $G$, respectively. We will confine our attention to rank tests which may be defined conveniently in terms of the following $Z$ statistics:

$$Z_j = \begin{cases} 0, & \text{if the $j$th smallest observation among } X_1, \dots, X_n \text{ comes from population } F, \\ 1, & \text{otherwise,} \end{cases} \qquad j = 1, \dots, n.$$

Then any non-randomized rank test $\phi$ may be expressed by

$$\phi(X_1, \dots, X_n) = \begin{cases} 0, & \text{if } T(Z_1, \dots, Z_n) < c, \\ 1, & \text{if } T(Z_1, \dots, Z_n) > c, \end{cases} \tag{1}$$

where $T(z_1, \dots, z_n)$ is a function defined on $z = (z_1, \dots, z_n)$ with $z_j = 0$ or 1, $j = 1, \dots, n$, and $c$ is a constant. $\phi(X_1, \dots, X_n)$ is the probability of rejecting the null hypothesis $H_0$ under observation $X_1, \dots, X_n$. We denote by $\phi_T$ the test $\phi$ defined by (1).
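As an illustration of the $Z$ statistics and of the test (1), the following Python sketch pools two samples, reads off $Z_1, \dots, Z_n$, and evaluates a linear statistic $T(z) = \sum_j a_j z_j$. The function names, the sample values, and the choice to accept when $T = c$ are assumptions made for this example only; ties are assumed absent.

```python
def z_statistics(first, second):
    """Return [Z_1, ..., Z_n]: Z_j = 0 if the j-th smallest pooled observation
    comes from the first sample (population F), and 1 otherwise."""
    pooled = [(x, 0) for x in first] + [(x, 1) for x in second]
    pooled.sort(key=lambda pair: pair[0])
    return [label for _, label in pooled]


def rank_test(z, scores, c):
    """Test (1) with T(z) = sum_j a_j z_j: 1 means reject, 0 means accept.
    Accepting when T == c is an assumption; (1) leaves that case open."""
    t = sum(a * zj for a, zj in zip(scores, z))
    return 1 if t > c else 0


first = [0.3, 1.2, 2.5]            # hypothetical sample from F
second = [0.9, 3.1, 4.0]           # hypothetical sample from G
z = z_statistics(first, second)
wilcoxon_scores = [1, 2, 3, 4, 5, 6]   # a_j = j, one familiar choice of scores
print(z, rank_test(z, wilcoxon_scores, c=12))
```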

If $T(z)$ is a constant function of $z$, the test $\phi_T$ is trivial. Two rank statistics $T(z)$ and $T'(z)$ define the same rank test if

$$T'(z) = \lambda T(z) + \beta \quad \text{for all } z \tag{2}$$

with positive $\lambda$ and arbitrary $\beta$.

In what follows, we are interested only in non-trivial rank tests, and two statistics $T'$ and $T$ satisfying (2) may be considered as identical.

The size $\alpha_{\phi_T}$ of test $\phi_T$ is given by

$$\alpha_{\phi_T} = \sum_z \phi_T(z) P(z \mid F, F) \tag{3}$$

and the power function $\beta_T(F, G)$ may be expressed as

$$\beta_T(F, G) = \sum_z \phi_T(z) P(z \mid F, G), \tag{4}$$

where $P(z \mid F, G)$ represents the probability of $Z = z$ when $F$ and $G$ are the true distributions, and the summation $\sum_z$ is over all $z = (z_1, \dots, z_n)$ with $z_j = 0$ or 1 such that $\sum_j z_j = n_2$. $P(z \mid F, G)$ may be expressed as follows:


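$$P(z \mid F, G) = n_1!\, n_2! \int \cdots \int_{-\infty < u_1 \le \cdots \le u_n < +\infty} \prod_{j=1}^{n} d\left[F(u_j)^{1 - z_j} G(u_j)^{z_j}\right]. \tag{5}$$

Since $F$ is assumed to be continuous, (5) may be written:

$$P(z \mid F, G) = P(z, H) = n_1!\, n_2! \int \cdots \int_{0 \le t_1 \le \cdots \le t_n \le 1} \prod_{j=1}^{n} d\left[t_j^{\,1 - z_j} H(t_j)^{z_j}\right], \tag{6}$$

where

$$H(t) = G[F^{-1}(t)], \quad 0 \le t \le 1. \tag{7}$$

$H(t)$ is a cumulative distribution function on [0, 1]. We shall denote by $\Omega$ the set of all possible $H$'s associated with any given two-sample problem, i.e.,

$$\Omega = \{H;\ H = GF^{-1},\ (F, G) \in H_0 \text{ or } H_1\}. \tag{8}$$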
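Under the null hypothesis $H(t) = t$, so the integrand in (6) reduces to $dt_1 \cdots dt_n$, the integral over the ordered simplex equals $1/n!$, and every arrangement $z$ with $n_2$ ones has probability $n_1!\,n_2!/n!$. The size (3) of a linear rank test can therefore be obtained by enumeration. The sketch below does this exactly; the function name `null_size` and the example scores are assumptions made for illustration.

```python
from fractions import Fraction
from itertools import combinations
from math import comb


def null_size(scores, n2, c):
    """Exact size of the test 'reject if sum_j a_j z_j > c' when F = G.
    Under the null every z with exactly n2 ones has probability 1 / C(n, n2)."""
    n = len(scores)
    rejected = 0
    for ones in combinations(range(n), n2):   # positions j where z_j = 1
        if sum(scores[j] for j in ones) > c:
            rejected += 1
    return Fraction(rejected, comb(n, n2))


# Hypothetical example: scores a_j = j with n1 = n2 = 3.
print(null_size([1, 2, 3, 4, 5, 6], n2=3, c=13))
```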


The sets $\Omega$ corresponding to the two-sample problems mentioned in Section 2 are as follows:

Problem (I): $\Omega_1$ is the set of all cumulative distribution functions $H$ over [0, 1].

Problem (II): $\Omega_2$ is the set of all cumulative distribution functions $H$ over [0, 1] such that $H(t) \le t$ for all $0 \le t \le 1$.

Problem (III): $\Omega_3$ is the set of all symmetric cumulative distribution functions $H$ over [0, 1]:

$$H(t) + H(1 - t) = 1, \quad \text{for all } 0 \le t \le 1.$$

Problem (IV): $\Omega_4$ is the set of all cumulative distributions $H$ over [0, 1] with mean not smaller than $\tfrac12$:

$$\int_0^1 t\,dH(t) \ge \tfrac12.$$


The set $\Omega$ in any problem of the above type has the following properties: First, $\Omega$ is a convex set; i.e., $H_1, H_2 \in \Omega$, and $0 \le \lambda \le 1$ imply $\lambda H_1 + (1 - \lambda) H_2 \in \Omega$. Secondly, for a sequence $H_1, H_2, \dots$ of distributions in $\Omega$, the condition that

$$\int_0^1 f(t)\,dH(t) = \lim_{\nu \to \infty} \int_0^1 f(t)\,dH_\nu(t)$$

for any continuous function $f(t)$ on [0, 1], implies that the distribution $H$ also belongs to the set $\Omega$. This last property is sometimes expressed by saying that the set $\Omega$ is closed in the weak* topology.²


4. Locally Most Powerful Rank Tests. A set of cumulative distribution functions $\{(F(x, \theta), G(x, \theta)) : 0 \le \theta \le \bar\theta\}$, where $\bar\theta > 0$, is called a one-parameter family of alternatives if the following conditions are satisfied:

(a) $(F(x, \theta), G(x, \theta)) \in H_1$, for $0 < \theta \le \bar\theta$,

(b) $F(x, 0) = G(x, 0)$,

and

(c) $H(t, \theta) = G[F^{-1}(t, \theta), \theta]$ is uniformly differentiable with respect to $\theta$ at $\theta = 0$. Here $H(t, \theta)$ is called uniformly differentiable at $\theta = 0$ if the convergence, as $\theta$ tends to 0, of $[H(t, \theta) - H(t, 0)]/\theta$ to $[\partial H(t, \theta)/\partial\theta]_{\theta=0}$ is uniform with respect to $t$.

A rank test $\phi_T$ is said to be locally most powerful if there exists a one-parameter family of alternatives $(F(x, \theta), G(x, \theta))$ such that $\phi_T$ is most powerful against the alternatives $(F(x, \theta), G(x, \theta))$, $0 < \theta < \theta_0$, for some positive number $\theta_0$, i.e.,

$$\beta_{\phi_T}(F(x, \theta), G(x, \theta)) \ge \beta_{\phi}(F(x, \theta), G(x, \theta)), \quad 0 < \theta < \theta_0, \tag{9}$$

for any rank test $\phi$ with size $\alpha_{\phi_T}$.

The calculation of locally most powerful rank tests will be done by the following theorem:

² Cf., e.g., Bourbaki [1].

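Theorem 1: The locally most powerful rank test $\phi_T$ against a one-parameter family $(F(x, \theta), G(x, \theta))$ is determined uniquely for each size of test, and is defined by

$$T(z) = \sum_{j=1}^{n} a_j z_j, \tag{10}$$

where

$$a_j = \binom{n-1}{j-1} \int_0^1 t^{j-1} (1 - t)^{n-j}\,dQ(t), \quad j = 1, \dots, n, \tag{11}$$

$$Q(t) = \left[\frac{\partial H(t, \theta)}{\partial\theta}\right]_{\theta=0}, \tag{12}$$

$$H(t, \theta) = G[F^{-1}(t, \theta), \theta]. \tag{13}$$

Proof: Let us define $P(z, \theta)$ by

$$P(z, \theta) = P(z \mid F(\cdot, \theta), G(\cdot, \theta)). \tag{14}$$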


Then, by the Neyman-Pearson Lemma, any rank test $\phi$ is locally most powerful against $(F(\cdot, \theta), G(\cdot, \theta))$ if, and only if, $\phi$ is defined by the statistic

$$T(z) = \left[\frac{\partial P(z, \theta)}{\partial\theta}\right]_{\theta=0}. \tag{15}$$


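On the other hand, differentiating (6) with respect to $\theta$, and noting that $H(t, \theta)$ is uniformly differentiable at $\theta = 0$, we have

$$\left[\frac{\partial P(z, \theta)}{\partial\theta}\right]_{\theta=0} = n_1!\, n_2! \sum_{j=1}^{n} z_j \int \cdots \int_{0 \le t_1 \le \cdots \le t_n \le 1} dt_1 \cdots dt_{j-1}\,dQ(t_j)\,dt_{j+1} \cdots dt_n, \tag{16}$$

where $Q(t)$ is defined by (12). Since

$$\int \cdots \int_{0 \le t_1 \le \cdots \le t_{j-1} \le t_j} dt_1 \cdots dt_{j-1} = \frac{1}{(j-1)!}\, t_j^{\,j-1}$$

and

$$\int \cdots \int_{t_j \le t_{j+1} \le \cdots \le t_n \le 1} dt_{j+1} \cdots dt_n = \frac{1}{(n-j)!}\, (1 - t_j)^{n-j},$$

the integral in (16) may be further simplified and we have

$$\left[\frac{\partial P(z, \theta)}{\partial\theta}\right]_{\theta=0} = n_1!\, n_2! \sum_{j=1}^{n} z_j\, \frac{1}{(j-1)!\,(n-j)!} \int_0^1 t^{j-1} (1 - t)^{n-j}\,dQ(t). \tag{17}$$

The relation (17), together with (15), proves the theorem. Q.E.D.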

Theorem 1 easily implies the following

Corollary: Any locally most powerful rank test is admissible.

In what follows, we shall first investigate locally most powerful rank tests for a general two-sample problem in which the set $\Omega$ is convex and closed in the weak* topology, and obtain a criterion for a rank test to be locally most powerful. We then consider the class of all locally most powerful rank tests for various two-sample problems mentioned above.


5. The Set A: A Special Class of Locally Most Powerful Rank Tests. Let us now consider a general two-sample problem for which the set $\Omega$ of all corresponding $H$ functions is convex and closed in the weak* topology. In this section we shall introduce the class of rank tests which are locally most powerful with respect to a special class of one-parameter families of alternatives.

Let $H$ be an arbitrary distribution function in $\Omega$, and consider a one-parameter family $(F(x, \theta), G(x, \theta))$, $0 \le \theta \le 1$, satisfying the condition that⁴

$$H(t, \theta) = (1 - \theta)t + \theta H(t), \quad 0 \le t \le 1,\ 0 \le \theta \le 1, \tag{18}$$

where

$$H(t, \theta) = G[F^{-1}(t, \theta), \theta].$$


By Theorem 1, a rank test $\phi_T$ is locally most powerful against the one-parameter family satisfying (18) if, and only if,

$$T(z) = \lambda \sum_{j=1}^{n} a_j(H) z_j + \beta, \tag{19}$$

where $\lambda$ is a positive number, $\beta$ an arbitrary number, and

$$a_j(H) = \binom{n-1}{j-1} \int_0^1 t^{j-1} (1 - t)^{n-j}\,dH(t), \quad j = 1, \dots, n. \tag{20}$$

We shall define the set $A$ of $n$-dimensional vectors by

$$A = \{a = (a_1, \dots, a_n);\ a_j = \lambda a_j(H) + \beta,\ j = 1, \dots, n,\ H \in \Omega,\ \lambda \ge 0, \text{ and } \beta \text{ is an arbitrary number}\}. \tag{21}$$

The set $A$, in other words, consists of all vectors that describe rank tests locally most powerful with respect to one-parameter families satisfying (18). It may be noted in particular that

$$a_j(H_0) = \frac{1}{n}, \quad j = 1, \dots, n, \tag{22}$$

where

$$H_0(t) = t, \quad 0 \le t \le 1.$$
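To make (20) and (22) concrete, the sketch below evaluates $a_j(H)$ exactly for the illustrative choice $H(t) = t^k$ with integer $k \ge 1$ (an assumption of this example; such an $H$ is a polynomial, the case footnote 4 attributes to Lehmann). For this $H$ the integral in (20) is a Beta integral, $a_j(H) = \binom{n-1}{j-1}\, k\, (j+k-2)!\,(n-j)!/(n+k-1)!$. The function name `scores_power` is hypothetical.

```python
from fractions import Fraction
from math import comb, factorial


def scores_power(n, k):
    """a_j(H), j = 1..n, from (20) for the assumed alternative H(t) = t**k.
    Uses int_0^1 t^(j+k-2) (1-t)^(n-j) dt = (j+k-2)! (n-j)! / (n+k-1)!."""
    return [
        comb(n - 1, j - 1)
        * Fraction(k * factorial(j + k - 2) * factorial(n - j), factorial(n + k - 1))
        for j in range(1, n + 1)
    ]


print(scores_power(5, 1))   # k = 1 is H_0(t) = t, the case covered by (22)
print(scores_power(4, 2))   # k = 2 gives scores proportional to j
```

With $k = 2$ the scores are $2j/(n(n+1))$, so by (19) the resulting test is the one based on $\sum_j j z_j$, the rank-sum statistic of the second sample.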


⁴ The case in which $H(t)$ is a polynomial was considered by Lehmann [5].
It will first be seen that the set $A$ is a closed convex set. For any $H_1, H_2 \in \Omega$, and $0 \le \lambda \le 1$, we have $a[(1 - \lambda)H_1 + \lambda H_2] = (1 - \lambda)a(H_1) + \lambda a(H_2)$, which, together with the definition (21) of $A$, implies that the set $A$ is a convex cone.

In order to prove the closedness of the set $A$, let $\{(a_1^\nu, \dots, a_n^\nu)\}$ be a sequence of $n$-vectors in $A$ which converges to an $n$-vector $a^0 = (a_1^0, \dots, a_n^0)$. Since $a^\nu$ is in $A$, there exist $H^\nu \in \Omega$, $\lambda^\nu \ge 0$, and $\beta^\nu$ such that

$$a_j^\nu = \lambda^\nu a_j(H^\nu) + \beta^\nu, \quad j = 1, \dots, n,\ \nu = 1, 2, \dots. \tag{23}$$

Taking a suitable subsequence of $\{H^\nu\}$, if necessary, we may without loss of generality suppose⁵ that for any continuous function $f(t)$ on [0, 1],

$$\lim_{\nu \to \infty} \int_0^1 f(t)\,dH^\nu(t) = \int_0^1 f(t)\,dH(t), \tag{24}$$

with some distribution function $H(t)$ over [0, 1]. Since $\Omega$ is closed in the weak* topology, the function $H$ belongs to the set $\Omega$.

If the sequences $\{\lambda^\nu\}$ and $\{\beta^\nu\}$ are bounded, then, for any limiting points $\lambda^0$ and $\beta^0$, we have, by (23) and (24), $a_j^0 = \lambda^0 a_j(H) + \beta^0$, $j = 1, \dots, n$, which shows that the vector $a^0$ belongs to the set $A$. If both sequences $\{\lambda^\nu\}$ and $\{\beta^\nu\}$ are unbounded, then we have $H(t) = t$, for all $t$. Hence, the vector $a^0$ trivially belongs to the set $A$.


6. The Set B. In order to investigate further the structure of the set $A$, we now introduce the linear transformation $L$ which maps an $n$-vector $a = (a_1, \dots, a_n)$ to the $n$-vector $b = (b_0, b_1, \dots, b_{n-1})$ defined by

$$L(a) = (L_0(a), L_1(a), \dots, L_{n-1}(a)),$$

where

$$L_j(a) = \sum_{s=j+1}^{n} \binom{s-1}{j} a_s \Big/ \binom{n-1}{j}, \quad j = 0, 1, \dots, n-1. \tag{25}$$

The inverse linear transformation $L^{-1}$ of $L$ is defined by

$$L^{-1}(b) = (L_1^{-1}(b), \dots, L_n^{-1}(b)),$$

where

$$L_j^{-1}(b) = \sum_{s=j-1}^{n-1} (-1)^{s-j+1} \binom{n-1}{s} \binom{s}{j-1} b_s, \quad j = 1, \dots, n. \tag{26}$$

The fact that the linear transformation $L^{-1}$ defined by (26) is the inverse of $L$ defined by (25) is easily seen from the following identities:

$$t^j = \sum_{s=j+1}^{n} \binom{n-j-1}{s-j-1} t^{s-1} (1 - t)^{n-s}, \quad j = 0, 1, \dots, n-1, \tag{27}$$

and

$$t^{j-1} (1 - t)^{n-j} = \sum_{s=j-1}^{n-1} (-1)^{s-j+1} \binom{n-j}{s-j+1} t^s, \quad j = 1, \dots, n. \tag{28}$$
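The transformations (25) and (26) are easy to check numerically. The sketch below implements both with exact rational arithmetic; the function names `L` and `L_inv` and the test vector are assumptions made for the example. Applying `L` to $(1, \dots, 1)$ also gives $n/(j+1)$ in position $j$, which follows from (25) by the column-sum identity for binomial coefficients.

```python
from fractions import Fraction
from math import comb


def L(a):
    """Transformation (25): a = (a_1..a_n) -> b = (b_0..b_{n-1}); a[s-1] holds a_s."""
    n = len(a)
    return [
        sum(Fraction(comb(s - 1, j), comb(n - 1, j)) * a[s - 1]
            for s in range(j + 1, n + 1))
        for j in range(n)
    ]


def L_inv(b):
    """Transformation (26): b = (b_0..b_{n-1}) -> a = (a_1..a_n)."""
    n = len(b)
    return [
        sum((-1) ** (s - j + 1) * comb(n - 1, s) * comb(s, j - 1) * b[s]
            for s in range(j - 1, n))
        for j in range(1, n + 1)
    ]


a = [3, -1, 4, 1]           # arbitrary illustrative vector
print(L(a), L_inv(L(a)))    # the second list reproduces a
```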


⁵ Cf., e.g., Bourbaki [2].

Let us now define the set $B$ as the image of the set $A$ by the linear transformation $L$:

$$B = \{b = (b_0, \dots, b_{n-1}) : b = L(a) \text{ for some } a \in A\}. \tag{29}$$

Since the set $A$ is a closed convex cone and $L$ is linear, the set $B$ again is a closed convex cone in the $n$-vector space $E^n$. It is noted that we have, by the identities (27) and (28),

$$L_j(a(H)) = \int_0^1 t^j\,dH(t), \quad j = 0, 1, \dots, n-1, \tag{30}$$

for any $H \in \Omega$. We have, in particular, that

$$L_j(1, \dots, 1) = n/(j + 1), \quad j = 0, 1, \dots, n-1. \tag{31}$$


The definitions (21) and (29) of the sets $A$ and $B$, together with (30) and (31), imply that an $n$-vector $b = (b_0, \dots, b_{n-1})$ belongs to the set $B$ if, and only if, there exist $H \in \Omega$, $\lambda \ge 0$ and a real number $\beta$ such that

$$b_j = \lambda \int_0^1 t^j\,dH(t) + \beta/(j + 1), \quad j = 0, 1, \dots, n-1. \tag{32}$$

We shall give a necessary and sufficient condition for an $n$-vector $b = (b_0, \dots, b_{n-1})$ to be in the set $B$.

We first define the polar cone $B^*$ of any set $B$ of $n$-vectors $b = (b_0, \dots, b_{n-1})$ as the set of all $n$-vectors $y = (y_0, \dots, y_{n-1})$ whose inner product with any vector in $B$ is non-negative:

$$B^* = \{y = (y_0, \dots, y_{n-1}) : y \cdot b \ge 0 \text{ for all } b \in B\}, \tag{33}$$

where $y \cdot b$ denotes the inner product of two vectors $y$ and $b$:

$$y \cdot b = \sum_{j=0}^{n-1} y_j b_j.$$

For any $n$-vector $y = (y_0, \dots, y_{n-1})$, let us define the polynomial $y(t)$ by

$$y(t) = \sum_{j=0}^{n-1} y_j t^j. \tag{34}$$

By the definition (33), and the relation (32), an $n$-vector $y = (y_0, \dots, y_{n-1})$ belongs to the set $B^*$ if, and only if,

$$\sum_{j=0}^{n-1} y_j \left[\lambda \int_0^1 t^j\,dH(t) + \beta/(j + 1)\right] \ge 0 \tag{35}$$

for all $H \in \Omega$, $\lambda \ge 0$, and $\beta$ real.

The relation (35), in view of (34), is equivalent to the following:

$$\int_0^1 y(t)\,dH(t) \ge 0, \text{ for all } H \in \Omega, \text{ and } \int_0^1 y(t)\,dt = 0.$$

Therefore, we have

Lemma 1. An $n$-vector $y = (y_0, \dots, y_{n-1})$ belongs to the set $B^*$ if, and only if,

$$\int_0^1 y(t)\,dH(t) \ge 0, \quad \text{for all } H \in \Omega, \tag{36}$$

and

$$\int_0^1 y(t)\,dt = 0. \tag{37}$$

On the other hand, since the set $B$ is a closed convex cone in the $n$-vector space $E^n$, we have, by the duality theorem⁶ on closed convex cones, that

$$B^{**} = B. \tag{38}$$

The relation (38) may be expressed as

Lemma 2. An $n$-vector $b = (b_0, \dots, b_{n-1})$ belongs to the set $B$ if, and only if,

$$b \cdot y \ge 0, \quad \text{for all } y \in B^*.$$


7. The Two-Sided Two-Sample Problem. We shall first consider Problem (I): Test the null hypothesis $H_0 : F = G$ against the alternatives $H_1 : F \ne G$. In this case, the set $\Omega_1$ consists of all cumulative distribution functions $H$ over [0, 1].

The class of all locally most powerful rank tests is characterized by the following theorem:

Theorem 2: A non-trivial rank test $\phi_T$ is locally most powerful for Problem (I) if, and only if,

$$T(z) = \sum_{j=1}^{n} a_j z_j, \tag{40}$$

where $a_1, \dots, a_n$ are arbitrary constants.

Proof: It will be shown first that the set $B^*$ consists of the zero vector $0 = (0, \dots, 0)$ alone. Indeed, let an $n$-vector $y = (y_0, \dots, y_{n-1})$ belong to $B^*$. By Lemma 1, the conditions (36) and (37) must be satisfied, where $\Omega_1$ is the set of all cumulative distribution functions $H$ on [0, 1]. Then the condition (36) implies that

$$y(t) \ge 0 \quad \text{for all } 0 \le t \le 1. \tag{41}$$

But since $y(t)$ is a polynomial, the relation (36), together with (37), implies that $y(t) = 0$, for all $0 \le t \le 1$. Hence, $y_j = 0$, $j = 0, 1, \dots, n-1$.

The polar cone $B^{**}$ of $B^* = (0)$ is now the set $E^n$ of all $n$-vectors. By Lemma 2, therefore, we have $B = E^n$. Hence, the set $A$ also is equal to the set $E^n$ of all $n$-vectors, and any test $\phi_T$ defined in terms of $T(z)$ of the form (40) is locally most powerful.


⁶ Cf., e.g., Bourbaki [1] or Fenchel [3].
8. The One-Sided Two-Sample Problem. In this section we will be concerned with Problem (II): Test the hypothesis $H_0 : F = G$ against the alternatives $H_1 : F > G$. The space $\Omega_2$ for this problem consists of all cumulative distribution functions $H$ on [0, 1] such that

$$H(t) \le t, \quad 0 \le t \le 1. \tag{42}$$

Before stating the characterization of the class of all locally most powerful rank tests for Problem (II), we introduce some concepts from the Hausdorff theory of moments.⁷

An $m$-vector $c = (c_0, c_1, \dots, c_{m-1})$ may be called here a solution to the $m$-dimensional moment problem over [0, 1] if there exist a distribution function $H$ over [0, 1] and a non-negative number $\lambda$ such that

$$c_j = \lambda \int_0^1 t^j\,dH(t), \quad j = 0, 1, \dots, m-1. \tag{43}$$
0 


Let $C_m$ be the set of all solutions to the $m$-dimensional moment problem over [0, 1]:

$$C_m = \left\{c = (c_0, c_1, \dots, c_{m-1}) : c_j = \lambda \int_0^1 t^j\,dH(t),\ j = 0, 1, \dots, m-1, \text{ for some distribution } H \text{ over } [0, 1] \text{ and non-negative number } \lambda\right\}. \tag{44}$$

Similar to the set $B$, the set $C_m$ here is also a closed convex cone in the $m$-vector space.

The polar cone $C_m^*$ may be written as

$$C_m^* = \left\{z = (z_0, \dots, z_{m-1}) : \int_0^1 z(t)\,dH(t) \ge 0 \text{ for all distributions } H\right\} = \{z = (z_0, \dots, z_{m-1}) : z(t) \ge 0, \text{ for all } 0 \le t \le 1\}, \tag{45}$$

where $z(t)$ is defined by $z(t) = \sum_{j=0}^{m-1} z_j t^j$. Hence, by the duality theorem on closed convex cones, we have that

Lemma 3. For an $m$-vector $c = (c_0, \dots, c_{m-1})$, $c \in C_m$ if, and only if, $c \cdot z \ge 0$, for all $z = (z_0, \dots, z_{m-1})$ such that $z(t) \ge 0$, $0 \le t \le 1$.

By a theorem from the Hausdorff theory of moment problems,⁸ we have, on the other hand, that an $m$-vector $c = (c_0, \dots, c_{m-1})$ is a solution to the $m$-dimensional moment problem over [0, 1] if, and only if, the Hankel determinants $\underline{\Delta}_s(c_0, \dots, c_s)$ and $\overline{\Delta}_s(c_0, \dots, c_s)$ are all non-negative, for all $s = 0, 1, \dots, m-1$.


⁷ For the Hausdorff theory of moments, the reader is referred to, e.g., Shohat and Tamarkin [7] or Karlin and Shapley [4].
⁸ Cf. Karlin and Shapley [4], pp. 54-57.
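The Hankel determinants $\underline{\Delta}_s(c_0, \dots, c_s)$ and $\overline{\Delta}_s(c_0, \dots, c_s)$ are defined by

$$\underline{\Delta}_{2r}(c_0, \dots, c_{2r}) = \begin{vmatrix} c_0 & c_1 & \cdots & c_r \\ \vdots & & & \vdots \\ c_r & c_{r+1} & \cdots & c_{2r} \end{vmatrix},$$

$$\underline{\Delta}_{2r+1}(c_0, \dots, c_{2r+1}) = \begin{vmatrix} c_1 & c_2 & \cdots & c_{r+1} \\ \vdots & & & \vdots \\ c_{r+1} & c_{r+2} & \cdots & c_{2r+1} \end{vmatrix},$$

$$\overline{\Delta}_{2r}(c_0, \dots, c_{2r}) = \begin{vmatrix} c_1 - c_2 & c_2 - c_3 & \cdots & c_r - c_{r+1} \\ \vdots & & & \vdots \\ c_r - c_{r+1} & c_{r+1} - c_{r+2} & \cdots & c_{2r-1} - c_{2r} \end{vmatrix},$$

$$\overline{\Delta}_{2r+1}(c_0, \dots, c_{2r+1}) = \begin{vmatrix} c_0 - c_1 & c_1 - c_2 & \cdots & c_r - c_{r+1} \\ \vdots & & & \vdots \\ c_r - c_{r+1} & c_{r+1} - c_{r+2} & \cdots & c_{2r} - c_{2r+1} \end{vmatrix}.$$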

The class of all locally most powerful rank tests for Problem (II) may now be characterized by the following:

Theorem 3. A non-trivial rank test $\phi_T$ is locally most powerful for Problem (II) if, and only if, $T(z_1, \dots, z_n) = \sum_{j=1}^{n} a_j z_j$, and $(c_0, c_1, \dots, c_{n-2})$ has all non-negative Hankel determinants, where

$$c_j = \left[1 \Big/ \binom{n-2}{j}\right] \sum_{s=j}^{n-2} \binom{s+1}{j+1} (a_{s+2} - \bar{a}), \quad j = 0, \dots, n-2, \qquad \bar{a} = \frac{1}{n} \sum_{s=1}^{n} a_s. \tag{46}$$
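The criterion of Theorem 3 can be evaluated mechanically. The sketch below computes $c_0, \dots, c_{n-2}$ from (46), forms the Hankel determinants defined above with exact rational arithmetic, and checks that all of them are non-negative. The function names are hypothetical, and treating the empty determinant $\overline{\Delta}_0$ as 1 is an assumed convention of this sketch, not something stated in the text.

```python
from fractions import Fraction
from math import comb


def det(m):
    """Exact determinant by Gaussian elimination; the 0 x 0 determinant is 1."""
    m = [[Fraction(x) for x in row] for row in m]
    n, d = len(m), Fraction(1)
    for i in range(n):
        p = next((r for r in range(i, n) if m[r][i] != 0), None)
        if p is None:
            return Fraction(0)
        if p != i:                      # row swap flips the sign
            m[i], m[p] = m[p], m[i]
            d = -d
        d *= m[i][i]
        for r in range(i + 1, n):
            f = m[r][i] / m[i][i]
            for k in range(i, n):
                m[r][k] -= f * m[i][k]
    return d


def hankel_determinants(c):
    """Lower and upper Hankel determinants of c = (c_0..c_{m-1}), s = 0..m-1."""
    out = []
    for s in range(len(c)):
        r, odd = divmod(s, 2)
        if odd:   # s = 2r + 1
            lower = [[c[i + j + 1] for j in range(r + 1)] for i in range(r + 1)]
            upper = [[c[i + j] - c[i + j + 1] for j in range(r + 1)] for i in range(r + 1)]
        else:     # s = 2r
            lower = [[c[i + j] for j in range(r + 1)] for i in range(r + 1)]
            upper = [[c[i + j + 1] - c[i + j + 2] for j in range(r)] for i in range(r)]
        out += [det(lower), det(upper)]
    return out


def c_vector(a):
    """c_0..c_{n-2} from (46); a[s + 1] holds a_{s+2}. Requires n >= 2."""
    n = len(a)
    abar = Fraction(sum(a), n)
    return [
        Fraction(1, comb(n - 2, j))
        * sum(comb(s + 1, j + 1) * (a[s + 1] - abar) for s in range(j, n - 1))
        for j in range(n - 1)
    ]


def satisfies_theorem_3(a):
    return all(d >= 0 for d in hankel_determinants(c_vector(a)))


print(c_vector([1, 2, 3, 4]), satisfies_theorem_3([1, 2, 3, 4]))
print(satisfies_theorem_3([4, 3, 2, 1]))
```

For the increasing scores $a_j = j$ with $n = 4$ the vector $c$ is $(5, 5/2, 3/2)$ and the criterion holds, while the reversed scores give $c_0 < 0$ and fail it.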


Proof: In the present case, the set $\Omega_2$ consists of all cumulative distribution functions $H(t)$ over [0, 1] such that

$$H(t) \le t, \quad \text{for all } 0 \le t \le 1. \tag{47}$$

By Lemma 1, an $n$-vector $y = (y_0, \dots, y_{n-1})$ belongs to the polar cone $B^*$ if, and only if,

$$\int_0^1 y(t)\,dH(t) \ge 0, \quad \text{for all } H \text{ such that } H(t) \le t,\ 0 \le t \le 1, \tag{48}$$

and

$$\int_0^1 y(t)\,dt = 0. \tag{49}$$

We shall show that in view of (49) condition (48) may be replaced by

$$y'(t) \ge 0 \quad \text{for all } 0 \le t \le 1. \tag{50}$$

In fact, for any polynomial $y(t)$ satisfying (49), we have, by a partial integration, that

$$\int_0^1 y(t)\,dH(t) = y(1)H(1) - y(0)H(0) - \int_0^1 y'(t) H(t)\,dt = \int_0^1 y'(t)(t - H(t))\,dt. \tag{51}$$

Now suppose that there exists $t_0$ such that $y'(t_0) < 0$, $0 \le t_0 \le 1$. Then, by the continuity of $y'(t)$, there is an interval $I$ in [0, 1] containing $t_0$ such that $y'(t) < 0$, for all $t \in I$. It is then possible to construct a cumulative distribution function $H_1$ over [0, 1] such that

    $t - H_1(t) = 0$, for all $t \notin I$,
    $t - H_1(t) > 0$, for $t$ interior to $I$.

Then $H_1$ belongs to the set $\Omega_2$ and by (50) and (51),

$$\int_0^1 y(t)\,dH_1(t) = \int_I y'(t)(t - H_1(t))\,dt < 0,$$

which contradicts (48) and (51). Therefore, we have $y'(t) \ge 0$, for all $0 \le t \le 1$.

The polar cone $B^*$, therefore, may be characterized by

$$B^* = \left\{y = (y_0, \dots, y_{n-1}) : y'(t) \ge 0, \text{ for all } 0 \le t \le 1, \text{ and } \int_0^1 y(t)\,dt = 0\right\}. \tag{52}$$


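We may rewrite (52) as follows: $y \in B^*$ if, and only if,

$$y_j = \frac{1}{j} z_{j-1}, \quad j = 1, \dots, n-1, \tag{53}$$

and

$$y_0 = -\sum_{j=1}^{n-1} \frac{1}{j(j+1)} z_{j-1}, \tag{54}$$

for some $z = (z_0, \dots, z_{n-2}) \in C_{n-1}^*$.

By Lemma 2, (53) and (54) imply that $b = (b_0, b_1, \dots, b_{n-1}) \in B$ if, and only if,

$$\sum_{j=1}^{n-1} \frac{1}{j} \left(b_j - \frac{b_0}{j+1}\right) z_{j-1} \ge 0 \quad \text{for all } z = (z_0, \dots, z_{n-2}) \in C_{n-1}^*. \tag{55}$$

By Lemma 3, the relation (55) is satisfied if, and only if,

$$c_j = \frac{1}{j+1} \left(b_{j+1} - \frac{b_0}{j+2}\right), \quad j = 0, 1, \dots, n-2, \tag{56}$$

have all non-negative Hankel determinants. Substituting (25) into (56), $c_j$ are expressed by (46), up to the positive factor $n - 1$, which does not affect the signs of the Hankel determinants.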
We now show that any locally most powerful rank test $\phi_T$ may be expressed in terms of $T(z) = \sum_{j=1}^{n} a_j z_j$ with $a = (a_1, \dots, a_n) \in A$. In fact, let $\phi_T$ be locally most powerful against a one-parameter family $(F(x, \theta), G(x, \theta))$. By Theorem 1, we have $T(z) = \sum_{j=1}^{n} a_j z_j$, where $a_j$ are defined by (11). Since

$$H(0, \theta) = 0, \quad H(1, \theta) = 1,$$
$$H(t, \theta) \le t = H(t, 0), \quad 0 \le t \le 1,$$

we have

$$Q(0) = Q(1) = 0, \tag{57}$$

$$Q(t) \le 0, \quad 0 \le t \le 1, \tag{58}$$

where $Q(t)$ is defined by (12).

Let us first consider the case where $Q(t)$ is continuously differentiable on [0, 1]. Then, by (57) and (58), there exists a positive number $\lambda$ such that $H_1(t) = t + \lambda Q(t)$ is a cumulative distribution function over [0, 1], for which we have

$$H_1(t) \le t, \quad 0 \le t \le 1. \tag{59}$$

Consider a one-parameter family $(F_1(x, \theta), G_1(x, \theta))$ satisfying

$$H_1(t, \theta) = (1 - \theta)t + \theta H_1(t). \tag{60}$$

$H_1(t, \theta)$ belongs to the set $\Omega_2$, for $0 \le \theta \le 1$, and

$$\left[\frac{\partial H_1(t, \theta)}{\partial\theta}\right]_{\theta=0} = \lambda Q(t). \tag{61}$$

Therefore, $\sum_j a_j z_j$ is locally most powerful against the one-parameter family $(F_1(x, \theta), G_1(x, \theta))$ satisfying (60), and $a = (a_1, \dots, a_n)$ belongs to the set $A$ for Problem (II).

Now consider the general case where $Q(t)$ is not necessarily differentiable. Let $\{H_\nu(t, \theta);\ \nu = 1, 2, \dots\}$ be a sequence of cumulative distribution functions in $\Omega_2$ such that

$$\lim_{\nu \to \infty} Q_\nu(t) = Q(t), \quad 0 \le t \le 1, \tag{62}$$

where

$$Q_\nu(t) = \left[\frac{\partial H_\nu(t, \theta)}{\partial\theta}\right]_{\theta=0} \tag{63}$$

is continuously differentiable with respect to $t$. The locally most powerful rank order test against the one-parameter family $(1 - \theta)t + \theta H_\nu(t)$ is defined by

$$T^\nu(z) = \sum_{j=1}^{n} a_j^\nu z_j \tag{64}$$

where

$$a_j^\nu = \binom{n-1}{j-1} \int_0^1 t^{j-1} (1 - t)^{n-j}\,dQ_\nu(t). \tag{65}$$

Then the relations (62) and (65) imply that

$$\lim_{\nu \to \infty} a_j^\nu = a_j, \quad j = 1, \dots, n. \tag{66}$$

The vector $a^\nu = (a_1^\nu, \dots, a_n^\nu)$ belongs to $A$ which is a closed set. Hence, by (66), we have $a = (a_1, \dots, a_n) \in A$.


9. The Symmetric Two-Sided Two-Sample Problem. In this section we investigate the structure of the class of all locally most powerful rank tests for Problem (III). Problem (III) is to test the null hypothesis $H_0 : F = G$, symmetric, against the alternatives $H_1 : F \ne G$, symmetric with the same median. For Problem (III), the set $\Omega_3$ consists of all cumulative distributions $H$ over [0, 1] such that

$$H(t) + H(1 - t) = 1, \quad \text{for all } 0 \le t \le 1. \tag{67}$$

The class of all locally most powerful rank tests is characterized by the following

Theorem 4. A non-trivial rank test $\phi_T$ is locally most powerful for Problem (III) if, and only if, $T(z_1, \dots, z_n) = \sum_{j=1}^{n} a_j z_j$ with

$$\sum_{s=j+1}^{n} \binom{s-1}{j} (a_{n+1-s} - a_s) = 0, \quad \text{for all } j = 1, \dots, n. \tag{68}$$
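Condition (68) is a finite set of linear equations in the scores and can be checked directly. A minimal sketch follows; the function name and the example score vectors are assumptions. Since the transformation $L$ of (25) is invertible and the case $j = 0$ of the same sum vanishes identically, (68) should hold exactly when the scores are symmetric, $a_s = a_{n+1-s}$, and the example compares the two conditions on small integer vectors.

```python
from itertools import product
from math import comb


def satisfies_68(a):
    """Check condition (68) for scores a = (a_1..a_n); a[s-1] holds a_s."""
    n = len(a)
    return all(
        sum(comb(s - 1, j) * (a[n - s] - a[s - 1]) for s in range(j + 1, n + 1)) == 0
        for j in range(1, n + 1)      # the sum is empty, hence zero, for j = n
    )


print(satisfies_68([4, 1, 0, 1, 4]))   # symmetric scores
print(satisfies_68([1, 2, 3, 4, 5]))   # monotone scores

# Compare (68) with plain symmetry on all score vectors with entries 0, 1, 2.
agree = all(satisfies_68(a) == (list(a) == list(reversed(a)))
            for a in product(range(3), repeat=5))
print(agree)
```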


Proof: Since the set $\Omega_3$ consists of all cumulative distribution functions $H$ satisfying (67), Lemma 1 implies that an $n$-vector $y = (y_0, \dots, y_{n-1})$ belongs to the polar cone $B^*$ for Problem (III) if, and only if,

$$\int_0^1 y(t)\,dH(t) \ge 0, \quad \text{for all cumulative distribution functions } H(t) \text{ satisfying (67),} \tag{69}$$

and

$$\int_0^1 y(t)\,dt = 0. \tag{70}$$

If $H$ satisfies (67), we have

$$\int_0^1 y(t)\,dH(t) = \int_0^{1/2} [y(t) + y(1 - t)]\,dH(t).$$

Relations (69) and (70) may now be replaced by

$$\int_0^{1/2} [y(t) + y(1 - t)]\,dH(t) \ge 0, \quad \text{for any cumulative distribution functions } H \text{ over } [0, \tfrac12], \tag{71}$$

and

$$\int_0^{1/2} [y(t) + y(1 - t)]\,dt = 0. \tag{72}$$

But relations (71) and (72) are satisfied if, and only if, $y(t) + y(1 - t) = 0$, $0 \le t \le \tfrac12$. Hence, an $n$-vector $y = (y_0, \dots, y_{n-1})$ belongs to the polar cone $B^*$ if, and only if,

$$y(t) + y(1 - t) = 0 \quad \text{for all } 0 \le t \le 1. \tag{73}$$
The relation (73) may be written

$$y_j + (-1)^j \sum_{s=j}^{n-1} \binom{s}{j} y_s = 0, \quad j = 0, 1, \dots, n-1. \tag{74}$$

By Lemma 2, the set $B$ is equal to the polar cone $B^{**}$ of $B^*$. Therefore, by (74), an $n$-vector $b = (b_0, \dots, b_{n-1})$ is in $B$ if, and only if,

$$b_j = u_j + \sum_{s=0}^{j} (-1)^s \binom{j}{s} u_s, \quad j = 0, 1, \dots, n-1, \tag{75}$$

for some $u = (u_0, \dots, u_{n-1})$.

It may be noted that, for an $n$-vector $b = (b_0, \dots, b_{n-1})$, there exists an $n$-vector $u = (u_0, \dots, u_{n-1})$ for which the relations (75) are satisfied if, and only if,

$$b_j = \sum_{s=0}^{j} (-1)^s \binom{j}{s} b_s, \quad j = 0, 1, \dots, n-1. \tag{76}$$


It is evident that the relation (75) implies (76). On the other hand, let the relation (76) be satisfied. In order to prove the existence of $u_0, \dots, u_{n-1}$ satisfying (75), we use mathematical induction on $n$. Let us assume that we have found $u_0, \dots, u_{n-2}$ which satisfy the relations (75) for $j = 0, 1, \dots, n-2$.

If $n - 1$ is an even number, $u_{n-1}$ may be determined by

$$2u_{n-1} = b_{n-1} - \sum_{s=0}^{n-2} (-1)^s \binom{n-1}{s} u_s.$$

$u_0, \dots, u_{n-2}$ and $u_{n-1}$ then satisfy (75) for $j = 0, \dots, n-1$.

If $n - 1$ is an odd number, the relation (75) may be satisfied with $u_0, \dots, u_{n-2}$ and arbitrary $u_{n-1}$. In fact, by (76),

$$b_{n-1} = \sum_{s=0}^{n-1} (-1)^s \binom{n-1}{s} b_s = -b_{n-1} + \sum_{s=0}^{n-2} (-1)^s \binom{n-1}{s} b_s.$$

Hence,

$$2b_{n-1} = \sum_{s=0}^{n-2} (-1)^s \binom{n-1}{s} \left[u_s + \sum_{r=0}^{s} (-1)^r \binom{s}{r} u_r\right] = \sum_{s=0}^{n-2} (-1)^s \binom{n-1}{s} u_s + \sum_{r=0}^{n-2} \left[\sum_{s=r}^{n-2} (-1)^{s+r} \binom{n-1}{s} \binom{s}{r}\right] u_r. \tag{77}$$

But

$$\sum_{s=r}^{n-2} (-1)^{s+r} \binom{n-1}{s} \binom{s}{r} = \binom{n-1}{r} \sum_{k=0}^{n-r-2} (-1)^k \binom{n-r-1}{k} = -(-1)^{n-r-1} \binom{n-1}{r}.$$

Since $n - 1$ is odd, we may now write (77) as follows:

$$2b_{n-1} = 2 \sum_{s=0}^{n-2} (-1)^s \binom{n-1}{s} u_s. \tag{78}$$

Dividing (78) by 2 implies the relation (75) for $j = n - 1$. Expressing (76) in terms of $a_1, \dots, a_n$, we get the relation (68).
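The induction step for odd $n - 1$ rests on the binomial identity $\sum_{s=r}^{n-2} (-1)^{s+r} \binom{n-1}{s} \binom{s}{r} = -(-1)^{n-r-1} \binom{n-1}{r}$. The short sketch below checks it numerically for small $n$; the function names `lhs` and `rhs` are introduced only for this illustration.

```python
from math import comb


def lhs(n, r):
    """Left-hand side: sum over s = r..n-2 of (-1)^(s+r) C(n-1, s) C(s, r)."""
    return sum((-1) ** (s + r) * comb(n - 1, s) * comb(s, r) for s in range(r, n - 1))


def rhs(n, r):
    """Right-hand side: -(-1)^(n-r-1) C(n-1, r)."""
    return -((-1) ** (n - r - 1)) * comb(n - 1, r)


checked = all(lhs(n, r) == rhs(n, r) for n in range(2, 12) for r in range(n - 1))
print(checked)
```

When $n - 1$ is odd the right-hand side equals $(-1)^r \binom{n-1}{r}$, which is what turns (77) into (78).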

We now have to show that the set $A$ actually exhausts the class of all locally most powerful rank tests for Problem (III). Let $T(z) = \sum_{j} a_j z_j$ define the rank test which is locally most powerful against a one-parameter family $(F(x, \theta), G(x, \theta))$ such that

$$H(t, \theta) + H(1 - t, \theta) = 1, \quad 0 \le t \le 1. \tag{79}$$

Since the set $A$ is a closed set, it again suffices to consider the case in which $Q(t) = [\partial H(t, \theta)/\partial\theta]_{\theta=0}$ is continuously differentiable.

By (79) we have

$$Q(0) = Q(1) = 0, \quad Q(t) + Q(1 - t) = 0, \quad 0 \le t \le 1. \tag{80}$$

Since $Q(t)$ is continuously differentiable, there exists a positive number $\lambda$ such that $H_1(t) = t + \lambda Q(t)$ is a symmetric cumulative distribution function over [0, 1]. Hence, for some constant $\beta$, we have $a_j = \lambda a_j(H_1) + \beta$, which shows that $a = (a_1, \dots, a_n)$ belongs to the set $A$ for Problem (III).


10. The case $\int_0^1 F\,dG \ge \tfrac12$. We shall finally consider Problem (IV): Test the null hypothesis $H_0 : F = G$ against the alternatives $H_1 : F \ne G$, $\int_0^1 F\,dG \ge \tfrac12$. The set $\Omega_4$ for Problem (IV) is the set of all cumulative distributions $H$ over [0, 1] such that

$$\int_0^1 t\,dH(t) \ge \tfrac12. \tag{81}$$

Theorem 5. A non-trivial rank test $\phi_T$ is locally most powerful for Problem (IV) if, and only if, $T(z) = \sum_{j=1}^{n} a_j z_j$ with

$$\sum_{s=1}^{[(n+1)/2]} \left(\frac{n+1}{2} - s\right) (a_{n+1-s} - a_s) \ge 0, \tag{82}$$

where $[(n + 1)/2]$ denotes the greatest integer less than or equal to $(n + 1)/2$.
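Condition (82) is a single linear inequality in the scores. Pairing the terms $s$ and $n + 1 - s$ shows that its left-hand side equals $\sum_{s=1}^{n} (s - (n+1)/2)\,a_s$, so the scores must be non-negatively correlated with the centered rank. The sketch below evaluates both forms exactly; the function names and example vectors are assumptions made for illustration.

```python
from fractions import Fraction
from itertools import product


def lhs_82(a):
    """Left-hand side of (82) for scores a = (a_1..a_n); a[s-1] holds a_s."""
    n = len(a)
    half = Fraction(n + 1, 2)
    return sum((half - s) * (a[n - s] - a[s - 1]) for s in range(1, (n + 1) // 2 + 1))


def centered_form(a):
    """Equivalent form: sum over s of (s - (n+1)/2) * a_s."""
    n = len(a)
    return sum((s - Fraction(n + 1, 2)) * a[s - 1] for s in range(1, n + 1))


def satisfies_82(a):
    return lhs_82(a) >= 0


print(satisfies_82([1, 2, 3, 4, 5]), satisfies_82([5, 4, 3, 2, 1]))
same = all(lhs_82(a) == centered_form(a) for a in product(range(-1, 2), repeat=5))
print(same)
```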

Proof: The polar cone $B^*$ in the present case consists of all $n$-vectors $y = (y_0, \dots, y_{n-1})$ such that

$$\int_0^1 y(t)\,dH(t) \ge 0, \quad \text{for all } H \text{ for which } \int_0^1 t\,dH(t) \ge \tfrac12, \tag{83}$$

and

$$\int_0^1 y(t)\,dt = 0. \tag{84}$$

The set $\Omega_4$ in particular contains the set of all cumulative distribution functions $H$ such that $H(t) \le t$, for $0 \le t \le 1$. Therefore, by an argument similar to the one in Problem (II), we have

$$y'(t) \ge 0, \quad 0 \le t \le 1, \quad \text{for all } y \in B^*. \tag{85}$$

Similarly, since the set $\Omega_4$ contains all cumulative distribution functions $H$ over [0, 1] such that $H(t) + H(1 - t) = 1$, $0 \le t \le 1$, we have that

$$y(t) + y(1 - t) = 0, \quad 0 \le t \le 1, \quad \text{for all } y \in B^*. \tag{86}$$

We shall show furthermore that, for any $y \in B^*$,

$$y(t) \text{ is linear in } t, \quad 0 \le t \le 1. \tag{87}$$

In fact, let $\sigma$ and $\tau$ be arbitrary numbers between 0 and $\tfrac12$, and let $H_{\sigma,\tau}$ be the cumulative distribution function corresponding to the following probability distribution:

$$\text{Prob}\{t = \tfrac12 - \sigma\} = \tau/(\sigma + \tau),$$
$$\text{Prob}\{t = \tfrac12 + \tau\} = \sigma/(\sigma + \tau).$$

The cumulative distribution function $H_{\sigma,\tau}$ belongs to the set $\Omega_4$ for Problem (IV), and

$$\int_0^1 y(t)\,dH_{\sigma,\tau}(t) = \frac{\tau}{\sigma + \tau}\, y(\tfrac12 - \sigma) + \frac{\sigma}{\sigma + \tau}\, y(\tfrac12 + \tau).$$

Therefore, if $y = (y_0, \dots, y_{n-1}) \in B^*$, the relation (83) implies that

$$-y(\tfrac12 - \sigma)/\sigma \le y(\tfrac12 + \tau)/\tau, \quad \text{for all } 0 < \sigma, \tau < \tfrac12. \tag{88}$$

The relation (88), together with (86), implies that $y(t)$ is linear in $t$, $0 \le t \le 1$. Therefore, if $y$ belongs to the polar set $B^*$, we have, by (85), (86), and (87),

$$y_0 = -\tfrac12 y_1, \quad y_1 \ge 0, \quad y_2 = \cdots = y_{n-1} = 0. \tag{89}$$

On the other hand, let an $n$-vector $y = (y_0, \dots, y_{n-1})$ satisfy the relation (89). Then, for any $H \in \Omega_4$,

$$\int_0^1 y(t)\,dH(t) = y_0 + y_1 \int_0^1 t\,dH(t) \ge y_0 + \tfrac12 y_1 = 0,$$

$$\int_0^1 y(t)\,dt = y_0 + \tfrac12 y_1 = 0.$$

The vector $y$, therefore, belongs to the polar cone $B^*$. Hence, the polar cone $B^*$ consists of all $n$-vectors $y = (y_0, \dots, y_{n-1})$ satisfying the relation (89). The set $B$ is, by Lemma 2, equal to the polar cone $B^{**}$ of $B^*$. Thus, we have

$$b = (b_0, \dots, b_{n-1}) \in B \text{ if, and only if, } b_1 \ge \tfrac12 b_0. \tag{90}$$

Writing the relation (90) in terms of the $a$'s, we have that $a = (a_1, \dots, a_n) \in A$ if, and only if, (82) is satisfied.
Let $\phi_T$ be locally most powerful against a one-parameter family $(F(x, \theta), G(x, \theta))$. By Theorem 1, we have $T(z) = \sum_{j=1}^{n} a_j z_j$, where $a_j$ are defined by (11). Since

$$H(0, \theta) = 0, \quad H(1, \theta) = 1, \quad \int_0^1 t\,dH(t, \theta) \ge \tfrac12 = \int_0^1 t\,dH(t, 0), \quad 0 \le \theta \le \bar\theta,$$

we have

$$Q(0) = Q(1) = 0, \quad \int_0^1 t\,dQ(t) \ge 0. \tag{91}$$

It again suffices to consider the case where $Q(t)$ is continuously differentiable at every point $t$. By (91) there exists a positive number $\lambda$ such that $H_1(t) = t + \lambda Q(t)$ is a cumulative distribution function over [0, 1] and

$$\int_0^1 t\,dH_1(t) = \tfrac12 + \lambda \int_0^1 t\,dQ(t) \ge \tfrac12.$$

Consider a one-parameter family $(F_1(x, \theta), G_1(x, \theta))$ satisfying $H_1(t, \theta) = (1 - \theta)t + \theta H_1(t)$. $H_1(t, \theta)$ belongs to the set $\Omega_4$ for $0 \le \theta \le 1$, and, for some number $\beta$,

$$\left[\frac{\partial H_1(t, \theta)}{\partial\theta}\right]_{\theta=0} = \lambda Q(t) + \beta.$$

The vector $a = (a_1, \dots, a_n)$, therefore, belongs to the set $A$ for Problem (IV).
ON THE STRUCTURE OF DISTRIBUTION-FREE
STATISTICS¹


C. B. Bell²


Stanford University and San Diego State College 


Introduction and Summary. Let $X_1, X_2, \dots, X_n$ be a sample of a one-dimensional random variable $X$ which has the continuous cumulative probability function (cpf) $F$. It has been observed that the distribution-free statistics commonly appearing in the literature can be written in the form $\Psi[F(X_1), F(X_2), \dots, F(X_n)]$ where $\Psi$ is a measurable symmetric function defined on the unit cube. Such statistics are said to have structure (d).

Birnbaum and Rubin [12] have proved that for the family Q*, of strictly mono- 
tone continuous cpf’s, statistics of structure (d) possess a property stronger than 
that of being distribution-free. 

The purpose of this paper is to study the extension of the Birnbaum-Rubin 
(B-R) result to other classes of cpf’s and to present a different approach to 
these results. It is found that a one-sided extension of the B-R result is valid 
for all properly closed, symmetrically complete classes of cpf’s. Then, from the 
existing literature on completeness, one can conclude that the extension is valid 
for several other classes of statistical interest. 

The relation between statistics of structure (d) and strongly distribution-free 
statistics (Section 1) is of importance for two reasons. First of all, if one is de- 
signing distribution-free tests, the results here and in [12] guarantee that if 
one chooses a statistic of structure (d), one has a strongly distribution-free 
statistic for several large classes of cpf’s. 

On the other hand, if one has a strongly distribution-free statistic, the results
guarantee that it is of structure (d). Hence, its cpf can be written as the volume
of a polyhedral region in the n-dimensional unit cube. Under such circumstances
the work of Smirnov [20], Feller [13], Anderson and Darling [4], and Birnbaum
[9] indicates that it should be possible to evaluate the cpf explicitly; reduce it
to a system of recursion formulae; tabulate it with the aid of high-speed
computers; or at least evaluate its limiting distribution.

This article is divided into four sections. In Section 1 distribution-free statis- 
tics of various types are introduced. Section 2 contains some preliminary results 
concerning cpf’s. The main theorem is proved in Section 3; and Section 4 con- 
tains a survey of the known pertinent completeness results as well as a corollary 
of the main theorem. 


1. Distribution-free Statistics. Consistent with the notation of Scheffé [18]
and B-R [12] let
$\Omega$ = the class of all cpf's;
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the class of all non-degenerate cpf’s; 
the class of all continuous cpf's;
the class of all strictly monotone continuous cpf’s; 
= the class of all absolutely continuous (with respect to Lebesgue measure)
cpf's;

= the class of all cpf's with continuous derivatives;
the class of all cpf's which are uniform within intervals [11], [12]; and
the class of all cpf's with densities of the form


$$C(\theta_1, \cdots, \theta_n) \exp\{-x^{2n} - \theta_1 x - \theta_2 x^2 - \cdots - \theta_n x^n\},$$
[16]. Analogously, for the unit interval $I$, one defines
$\Omega(I)$ = the class of all cpf's on $I$;
$\Omega_1(I)$ = the class of all non-degenerate cpf's on $I$;
$\Omega_2(I)$ = the class of all continuous cpf's on $I$; etc.
If $\Omega$ and $\Omega'$ are two arbitrary families of cpf's, a real-valued function

$$S_G = S_G(X_1, X_2, \cdots, X_n)$$

will be called a statistic in $\Omega$ with regard to (w.r.t.) $\Omega'$ if, for every
$G \in \Omega$ and $F \in \Omega'$, and $X_1, X_2, \cdots, X_n$ in the n-dimensional sample
space for a random variable $X$ which has cpf $F$,

(a) $S_G(X^{(n)}) = S_G(X_1, X_2, \cdots, X_n)$ is defined everywhere in the sample
space, and

(b) $S_G = S_G(X^{(n)})$ has a probability distribution; this probability
distribution will be denoted by $P_F^{(n)} S_G^{-1}$.

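For example, consider von Mises' statistic

$$\omega_n^2 = n \int_{-\infty}^{\infty} [F_n(x) - G(x)]^2\, dG(x) = (1/12n) + \sum_{i=1}^{n} [G(X_i') - (2i-1)/2n]^2;$$

Kolmogoroff's statistic

$$D_n = \sup_{-\infty < x < \infty} |F_n(x) - G(x)| = \max_{1 \le i \le n} \{G(X_i') - (i-1)/n,\ (i/n) - G(X_i')\};$$

Anderson and Darling's

$$K_n = \sup_{-\infty < x < \infty} \sqrt{n}\, |F_n(x) - G(x)|\, (\psi[G(x)])^{1/2}$$

and

$$W_n^2 = n \int_{-\infty}^{\infty} [F_n(x) - G(x)]^2\, \psi[G(x)]\, dG(x),$$

where $F_n(x)$ is the empirical cpf determined by the sample $X_1, \cdots, X_n$; and
$X_1' \le X_2' \le \cdots \le X_n'$ are the ordered sample values. All satisfy (a) and (b) when
$\Omega = \Omega' = \Omega_2$. Hence $\omega_n^2$, $D_n$, $K_n$, and $W_n^2$ are all statistics in $\Omega_2$ w.r.t. $\Omega_2$.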
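The computing forms of the von Mises and Kolmogoroff statistics can be sketched in a few lines of Python. This is only an illustration: the function names, the three-point sample, and the choice of the uniform cpf on [0, 1] as $G$ are assumptions made for the example, and the von Mises sum is written with the usual centering $(2i-1)/2n$. Both functions touch the data only through the values of $G$ at the ordered sample, which is the feature the text calls structure (d).

```python
from typing import Callable, Sequence


def kolmogorov_d(sample: Sequence[float], G: Callable[[float], float]) -> float:
    """D_n = sup_x |F_n(x) - G(x)|, computed from G at the ordered sample."""
    # G is non-decreasing, so sorting G(X_i) gives G at the ordered sample values.
    u = sorted(G(x) for x in sample)
    n = len(u)
    return max(max(ui - (i - 1) / n, i / n - ui) for i, ui in enumerate(u, start=1))


def von_mises_omega2(sample: Sequence[float], G: Callable[[float], float]) -> float:
    """omega_n^2 = 1/(12n) + sum_i [G(X'_i) - (2i - 1)/(2n)]^2."""
    u = sorted(G(x) for x in sample)
    n = len(u)
    return 1.0 / (12 * n) + sum(
        (ui - (2 * i - 1) / (2 * n)) ** 2 for i, ui in enumerate(u, start=1)
    )


def uniform_cpf(x: float) -> float:
    """Hypothetical choice of G for the example: the uniform cpf on [0, 1]."""
    return min(1.0, max(0.0, x))


sample = [0.1, 0.4, 0.7]                    # illustrative data only
d = kolmogorov_d(sample, uniform_cpf)       # approximately 0.3
w2 = von_mises_omega2(sample, uniform_cpf)  # approximately 0.06
```

For continuous $G$ the maximum over the ordered sample equals the supremum over all $x$, because $F_n$ is a step function and $G$ is monotone between sample points.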

If for a statistic $S_G(X^{(n)})$ in $\Omega$ w.r.t. $\Omega'$ there exists a (measurable) function
$\Phi$ defined on the n-dimensional unit cube and symmetric in its arguments, such
that for any $G \in \Omega$, $F \in \Omega'$, we have $S_G(x^{(n)}) = \Phi[G(x_1), \cdots, G(x_n)]\ [P_F^{(n)}]$, i.e.
almost everywhere in the sample space $X^{(n)}$ for the random variable $X$ which
has cpf $F$, then $S_G(X^{(n)})$ is called a statistic of structure (d).

If $\Omega = \Omega'$ and $S_G(X^{(n)})$ has the property that $P_G^{(n)} S_G^{-1}$, the probability
distribution of $S_G$ when $X$ has cpf $G$, is independent of $G$ for all $G \in \Omega$, then $S_G(X^{(n)})$
is a distribution-free statistic in $\Omega$.

If $S_G(X^{(n)})$ is a statistic in $\Omega \subset \Omega^*$ w.r.t. some $\Omega'$, then $S_G(X^{(n)})$ is called a
strongly distribution-free statistic in $\Omega$ w.r.t. $\Omega'$ if $P_F^{(n)} S_G^{-1}$ depends only on the
function $\tau = FG^{-1}$ for all $G \in \Omega$ and $F \in \Omega'$.

In view of the preceding definitions, it can be readily established that 

(A) if a statistic in $\Omega_2$ w.r.t. $\Omega_2$ has structure (d), then it is
distribution-free in $\Omega_2$;

(B) if a statistic in $\Omega^*$ w.r.t. $\Omega^*$ is strongly distribution-free,
then it is distribution-free in $\Omega^*$; and

(C) if a statistic in $\Omega^*$ w.r.t. $\Omega^*$ has structure (d), then it is
strongly distribution-free.
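One way to see why structure (d) carries so much weight is that such a statistic depends on the data only through the values $G(X_i)$. The sketch below checks this numerically for Kolmogoroff's statistic: applying a strictly increasing map $h$ to the sample and replacing $G$ by $G \circ h^{-1}$ leaves the value unchanged. The sample, the normal cpf used as $G$, and the map $h = \exp$ are illustrative assumptions and do not come from the paper.

```python
import math
from typing import Callable, Sequence


def kolmogorov_d(sample: Sequence[float], G: Callable[[float], float]) -> float:
    """D_n computed from the ordered values G(X'_i)."""
    u = sorted(G(x) for x in sample)
    n = len(u)
    return max(max(ui - (i - 1) / n, i / n - ui) for i, ui in enumerate(u, start=1))


def normal_cpf(x: float) -> float:
    """Standard normal cpf, used here as the hypothesized G (an assumption)."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def lognormal_cpf(y: float) -> float:
    """G composed with the inverse of h = exp, i.e. G(log y) for y > 0."""
    return normal_cpf(math.log(y))


sample = [-1.2, -0.3, 0.1, 0.8, 1.5]           # illustrative data only
transformed = [math.exp(x) for x in sample]    # strictly increasing map h

d_original = kolmogorov_d(sample, normal_cpf)
d_transformed = kolmogorov_d(transformed, lognormal_cpf)
# The two values agree up to floating-point rounding.
```

The check concerns a single realized value, not a distribution. It illustrates the mechanism behind (A) and (C) and does not replace the measure-theoretic argument.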

Further, it is seen that each of the statistics (von Mises, etc.) in the example
above is, for properly chosen classes of cpf's, of structure (d); strongly
distribution-free and symmetric; and distribution-free. Such also is the case for
$D_n^+$ and $D_n^-$ of Wald and Wolfowitz [21], and Birnbaum [10]; the spacing statistics of
Kimball [17] and Sherman [19]; and most of the other distribution-free statistics 
in the literature. 

Birnbaum and Rubin [12] have shown that there exists a distribution-free 
statistic which is not strongly distribution-free; but the other two properties 
always seem to occur together in a statistic. For that reason it is of interest to 
find the conditions under which the property of having structure (d) is equiva- 
lent to being symmetric and strongly distribution-free. 

It is known [12] that these two properties are equivalent for statistics in $\Omega^*$
w.r.t. $\Omega^*$. In Section 3 it will be shown that the two properties are equivalent
for statistics in $\Omega \subset \Omega^*$ w.r.t. $\Omega'$, where $\Omega'$ satisfies certain closure and
completeness properties.

Before proceeding with the proof of this theorem, it is worthwhile to recall 
some definitions and results concerning cpf’s. This is done below in Section 2. 


2. Probability Functions. In view of the nature of the problem, the work will
deal primarily with probability spaces on the real line and on the unit interval.
For that reason the following classes and sets should be defined.

Let $R$, $R^{(n)}$, $I$, $I^{(n)}$, $\mathcal{B}$, $\mathcal{B}^{(n)}$, $\mathcal{B}_I$, and $\mathcal{B}_I^{(n)}$ be, respectively, the real line;
euclidean n-space; the open unit interval; the n-dimensional open unit cube; and the
respective classes of Borel subsets of $R$, $R^{(n)}$, $I$, $I^{(n)}$.

A cpf, $F(x)$, on $R$ is a non-decreasing, upper semi-continuous function defined
on $R$ and such that $\lim_{x \to \infty} F(x) = 1$ and $\lim_{x \to -\infty} F(x) = 0$. A cpf, $H(u)$, on $I$
is a non-decreasing, upper semi-continuous function defined on $I$ and such that
$\lim_{u \to 1} H(u) = 1$ and $\lim_{u \to 0} H(u) = 0$.
It is well known ([2], p. 96) that each cpf on $R$ induces and is induced by a
probability distribution on $\mathcal{B}$; similarly each cpf on $I$ induces and is induced by
a probability distribution on $\mathcal{B}_I$. Let $P_F$ denote the probability distribution
induced by the cpf $F(x)$; and let $P_F^{(n)}$ denote the power probability distribution on
the class $\mathcal{B}^{(n)}$ generated by $F$, i.e. the probability distribution induced by n
independent random variables each distributed with cpf $F$.

If $G, G_1 \in \Omega^*$, then $G$, $G^{-1}$, and $G_1 G^{-1}$ are all 1-1 strictly monotone,
continuous mappings; and, hence, preserve many of the properties of cpf's and their
probability distributions. In fact,

(i) if $F \in \Omega\ [\Omega_1, \Omega_2, \Omega^*]$ and $G, G_1 \in \Omega^*$, then

(a) $FG^{-1} \in \Omega(I)\ [\Omega_1(I), \Omega_2(I), \Omega^*(I)]$ and

(b) $FG^{-1}G_1 \in \Omega\ [\Omega_1, \Omega_2, \Omega^*]$.

Since the closure property (b) is important in the sequel, it is worthwhile to 
give the following formal definition. 

$\Omega'$ is said to be closed under $\Omega$ if $FG^{-1}G_1 \in \Omega'$ whenever $F \in \Omega'$ and $G, G_1 \in \Omega$.
Therefore, one concludes from (i) that $\Omega$, $\Omega_1$, $\Omega_2$ and $\Omega^*$ are each closed under
$\Omega^*$.

Further, it is seen that under such mappings numerical values are preserved 
in the following sense. 

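(ii) If $F \in \Omega$ and $G, G_1 \in \Omega^*$, then ($\alpha$) $P^{(n)}_{FG^{-1}}(B) = P^{(n)}_F G^{-1}(B)$ for all $B \in \mathcal{B}_I^{(n)}$;
and ($\beta$) $P^{(n)}_{FG^{-1}G_1}[G_1^{-1}(B)] = P^{(n)}_{FG^{-1}}(B)$ for all $B \in \mathcal{B}_I^{(n)}$, where

$$[G(x^{(n)})] = [G(x_1), \cdots, G(x_n)]$$
and $G^{-1}(u_1, \cdots, u_n) = [G^{-1}(u_1), \cdots, G^{-1}(u_n)]$.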


With these preliminary results one can proceed to establish the main theorem. 


3. The Main Theorem. As mentioned in the introduction the object here is 
to demonstrate that for suitable classes of cpf’s a statistic is symmetric and dis- 
tribution-free if and only if it is of structure (d). 

If a statistic, $S_G$, in $\Omega$ w.r.t. $\Omega'$ is of structure (d), there exists a measurable
function $\Phi$ defined on $I^{(n)}$ and symmetric in its arguments, such that for any
$G \in \Omega$ and $F \in \Omega'$, $S_G(x^{(n)}) = \Phi[G(x^{(n)})]\ [P_F^{(n)}]$.

If $A$ is an arbitrary element of $\mathcal{B}$, then $S_G^{-1}(A) = G^{-1}\Phi^{-1}(A)$. In view of
(ii), then, $P_F^{(n)} S_G^{-1}(A) = P_F^{(n)} G^{-1}\Phi^{-1}(A) = P^{(n)}_{FG^{-1}}\Phi^{-1}(A)$, provided $FG^{-1}$ is well
defined. Clearly, this will be so whenever $G \in \Omega^*$. Further, $S_G$ is symmetric
whenever $\Phi$ is. Therefore, one can conclude the following.

Lemma 1: If a statistic, $S_G$, in $\Omega \subset \Omega^*$ w.r.t. $\Omega'$ is of structure (d), then $S_G$ is
symmetric and strongly distribution-free.

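On the other hand, if $S_G$, a statistic in $\Omega \subset \Omega^*$ w.r.t. $\Omega'$, is symmetric and
strongly distribution-free, let $\Phi_1 = S_{G_1} \circ G_1^{-1}$, where $G_1$ is an arbitrary fixed
element of $\Omega \subset \Omega^*$.

It is clear that $\Phi_1$ is symmetric. Therefore, in order to complete the proof one
must demonstrate that $S_G(x^{(n)}) = \Phi_1[G(x^{(n)})]\ [P_F^{(n)}]$ for all $F \in \Omega'$ and all
$G \in \Omega \subset \Omega^*$.

Again let $A$ be an arbitrary fixed element of $\mathcal{B}$. Then,

$$P_F^{(n)}\{\Phi_1[G(x^{(n)})] \in A\} = P_F^{(n)} G^{-1}\Phi_1^{-1}(A) = P^{(n)}_{FG^{-1}}\Phi_1^{-1}(A) = P^{(n)}_{FG^{-1}} G_1 S_{G_1}^{-1}(A) = P^{(n)}_{FG^{-1}G_1} S_{G_1}^{-1}(A)$$

for all $F \in \Omega'$ and all $G \in \Omega \subset \Omega^*$.

Now, if $FG^{-1}G_1 \in \Omega'$, i.e. if $\Omega'$ is closed under $\Omega \subset \Omega^*$, then the fact that $S_G$ is
strongly distribution-free guarantees that $P^{(n)}_{FG^{-1}G_1} S_{G_1}^{-1}(A) = P_F^{(n)} S_G^{-1}(A)$, since
$(FG^{-1}G_1)G_1^{-1} = (F)G^{-1}$. Under these circumstances one sees that

$$P_F^{(n)}\{\Phi_1[G(x^{(n)})] \in A\} = P_F^{(n)}\{S_G(x^{(n)}) \in A\}$$

for all $F \in \Omega'$, $G \in \Omega \subset \Omega^*$.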


These results lead one to the following question. What conditions must the
class $\Omega'$ satisfy in order that $S_G$ and $\Phi_1 \circ G$, which have identical distributions
for each $F \in \Omega'$, be essentially equal? In answering this question, the following
definition will be employed.

A class, $\Omega$, of cpf's is said to be symmetrically complete if every unbiased,
symmetric estimator of zero, with respect to the class of power probability
distributions of $\Omega$, is essentially zero, i.e., the conditions (1) $f$ is symmetric; and (2)
$\int_{R^{(n)}} f\, dP_F^{(n)} = 0$ for all $F \in \Omega$, imply that $f = 0\ [P_F^{(n)}]$ for all $F \in \Omega$.

In terms of this definition, the answer to the question is as follows. 

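Lemma 2: If $S$ and $\Phi$ are symmetric measurable functions such that

$$P_F^{(n)}\{S \in A\} = P_F^{(n)}\{\Phi \in A\}$$

for all $A \in \mathcal{B}$ and all $F \in \Omega'$; and if $\Omega'$ is a symmetrically complete class, then

$$S = \Phi\ [P_F^{(n)}]$$

for all $F \in \Omega'$.

Proof: Let $g(B, x^{(n)})$ be the indicator function of $B$, i.e.

$$g(B, x^{(n)}) = 1 \text{ for } x^{(n)} \in B, \qquad g(B, x^{(n)}) = 0 \text{ otherwise};$$

then for each $A \in \mathcal{B}$ and each $F \in \Omega'$,

$$\int_{R^{(n)}} [g(S^{-1}(A), x^{(n)}) - g(\Phi^{-1}(A), x^{(n)})]\, dP_F^{(n)} = P_F^{(n)}\{S^{-1}(A)\} - P_F^{(n)}\{\Phi^{-1}(A)\} = 0.$$

Since $S$ and $\Phi$ are symmetric, $g(S^{-1}(A), x^{(n)})$ and $g(\Phi^{-1}(A), x^{(n)})$ are symmetric,
and so is their difference. Because of the completeness property of $\Omega'$,

$$g(S^{-1}(A), x^{(n)}) - g(\Phi^{-1}(A), x^{(n)}) = 0$$

and $g(S^{-1}(A), x^{(n)}) = g(\Phi^{-1}(A), x^{(n)})\ [P_F^{(n)}]$ for all $F \in \Omega'$. Consequently,

$$P_F^{(n)}\{S^{-1}(A)\, \Delta\, \Phi^{-1}(A)\} = 0$$

for all $F \in \Omega'$ and all $A \in \mathcal{B}$.
[Note: $E \Delta F = (E \cup F) - (E \cap F)$.]

$$P_F^{(n)}(S \ne \Phi) = P_F^{(n)}(S > \Phi) + P_F^{(n)}(\Phi > S)$$

$$\le \sum_{m=-\infty}^{\infty} \sum_{k=1}^{\infty} [P_F^{(n)}(S > (m/k);\ \Phi < (m/k)) + P_F^{(n)}(S < (m/k);\ \Phi > (m/k))]$$

$$\le \sum_{m=-\infty}^{\infty} \sum_{k=1}^{\infty} P_F^{(n)}((S < (m/k))\, \Delta\, (\Phi < (m/k))) = 0 \quad \text{for all } F \in \Omega'.$$

Therefore $S = \Phi\ [P_F^{(n)}]$ for all $F \in \Omega'$. The main theorem now follows immediately.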

The Main Theorem. If $S_G$ is a statistic in $\Omega$ w.r.t. $\Omega'$, then the property of
being symmetric and strongly distribution-free is equivalent to having structure (d),
whenever the following three conditions are fulfilled.

($\alpha$) $\Omega \subset \Omega^*$;

($\beta$) $\Omega'$ is closed under $\Omega$; and

($\gamma$) $\Omega'$ is a symmetrically complete class.

The next question is: Which classes of statistical interest satisfy the hypotheses 
of the main theorem? 


4. Closed and complete classes. As was previously mentioned one can con- 
clude from (i) that Q , Q; , 2 and Q* are closed under all subsets of 2*. Also, it 
can be proved that Q; , 2, , 2, and Q, do not satisfy that closure property. How- 
ever, one can verify that Q; is closed under Q; n Q*; and that Q, is closed under 
Qn 2*. 

The work of Halmos [16]; Fraser ([14], [15], [1], pp. 23-31); Lehmann ([3],
p. 132), and Bell-Blackwell-Breiman [8] establish the fact that 2 , 2; , % , Q, 
Q,, 2, and 2, are symmetrically complete. (It should be mentioned here that a 
class of cpf's is symmetrically complete if and only if the order statistic is a
complete statistic with respect to the class of power probability distributions
of the given class of cpf's.)

Therefore, 25, 2; , Q , 2; and Q satisfy both the completeness and closure 
hypotheses of the main theorem. Consequently, the following corollary to the 
main theorem is valid. 

Corollary: If $S_G$ is a statistic in $\Omega$ w.r.t. $\Omega'$, then the property of being
symmetric and strongly distribution-free is equivalent to having structure (d) for each
of the following cases.

(1) 8c QandQD =; 

(2) 9C OF and D =Q, ; 

(3) 9c OF and = Q ; 

(4) 2 C OF and? = 0; 

(5) 2 = Q;nQ* and = Q; ; and 

(6) Q Q, 9 OF and Q’ Q,. 


REFERENCES 


[1] D. A. S. Fraser, Non-parametric Methods in Statistics, New York, John Wiley and
Sons, 1957.

[2] M. Loève, Probability Theory, New York, D. van Nostrand, 1955.

[3] E. Lehmann, Testing Statistical Hypotheses, New York, John Wiley and Sons,
1959.

[4] T. W. Anderson and D. A. Darling, "Asymptotic theory of certain 'goodness of fit'
criteria based on stochastic processes," Ann. Math. Stat., Vol. 23 (1952), pp.
193-212.

[5] C. B. Bell, "Application of distribution-free statistics to some problems in missile
design and production," Douglas (Aircraft Co.), Santa Monica Report SM
18396 (1954).

[6] C. B. Bell, "On the structure of algebras and homomorphisms," Proc. Amer. Math.
Soc., Vol. 7 (1956), pp. 483-492.

[7] C. B. Bell, "On the structure of stochastic independence," Ill. J. Math., Vol. 2 (1958),
pp. 415-424.

[8] C. B. Bell, D. Blackwell and L. Breiman, "A note on the completeness of order
statistics," Ann. Math. Stat., Vol. 31 (1960), pp. 794-797.

[9] Z. W. Birnbaum, "Numerical tabulation of the distribution of Kolmogoroff's statistic
for finite sample size," J. Amer. Stat. Assoc., Vol. 47 (1952), pp. 425-441.

[10] Z. W. Birnbaum and F. H. Tingey, "One-sided confidence contours for probability
distribution functions," Ann. Math. Stat., Vol. 22 (1951), pp. 592-596.

[11] Z. W. Birnbaum, "Distribution-free tests for continuous distribution functions,"
Ann. Math. Stat., Vol. 24 (1953), pp. 1-8.

[12] Z. W. Birnbaum and H. Rubin, "On distribution-free statistics," Ann. Math. Stat.,
Vol. 25 (1954), pp. 593-598.

[13] W. Feller, "On the Kolmogoroff-Smirnoff limit theorems for empirical distributions,"
Ann. Math. Stat., Vol. 19 (1948), pp. 177-189.

[14] D. A. S. Fraser, "Completeness of order statistics," Can. J. Math., Vol. 6 (1953), pp.
42-45.

[15] D. A. S. Fraser, "Non-parametric theory: scale and location parameters," Can. J.
Math., Vol. 6 (1953), pp. 46-68.

[16] P. R. Halmos, "The theory of unbiased estimation," Ann. Math. Stat., Vol. 17 (1946),
pp. 34-43.

[17] B. F. Kimball, "Some basic theorems for developing tests of fit for the case of the
non-parametric probability distribution function, I," Ann. Math. Stat., Vol. 18
(1947), pp. 540-548.

[18] H. Scheffé, "On a measure problem arising in the theory of non-parametric tests,"
Ann. Math. Stat., Vol. 14 (1943), pp. 227-233.

[19] B. Sherman, "A random variable related to the spacing of sample values," Ann.
Math. Stat., Vol. 21 (1950), pp. 339-361.

[20] N. Smirnov, "Table for estimating the goodness of fit of empirical distributions,"
Ann. Math. Stat., Vol. 19 (1948), pp. 279-281.

[21] A. Wald and J. Wolfowitz, "Confidence limits for continuous distribution
functions," Ann. Math. Stat., Vol. 10 (1939), pp. 105-118.





SMALL SAMPLE DISTRIBUTIONS FOR MULTI-SAMPLE 
STATISTICS OF THE SMIRNOV TYPE 


By Z. W. Birnbaum and R. A. Hall


University of Washington 
1. Introduction and Summary. Let

(1.1) $X_1^{(i)} \le X_2^{(i)} \le \cdots \le X_{n_i}^{(i)}$, $i = 1, 2, \cdots, c$,

be samples of $c$ independent random variables $X^{(i)}$ with continuous cumulative
distribution functions $F^{(i)}$, and let

(1.2) $F_{n_i}^{(i)}(x) = 0$ for $x < X_1^{(i)}$,
      $F_{n_i}^{(i)}(x) = k/n_i$ for $X_k^{(i)} \le x < X_{k+1}^{(i)}$, $1 \le k < n_i$,
      $F_{n_i}^{(i)}(x) = 1$ for $X_{n_i}^{(i)} \le x$,

be the corresponding $c$ empirical distribution functions. We define the statistics

(1.3) $D(n_1, n_2, \cdots, n_c) = \sup_{x;\ i,j = 1, 2, \cdots, c} |F_{n_i}^{(i)}(x) - F_{n_j}^{(j)}(x)|$

and

(1.4) $D^+(n_1, n_2, \cdots, n_c) = \sup_{x;\ i < j;\ i,j = 1, 2, \cdots, c} [F_{n_i}^{(i)}(x) - F_{n_j}^{(j)}(x)]$.

The well known Kolmogorov-Smirnov statistics $D(m, n)$ and $D^+(m, n)$ are
special cases of (1.3) and (1.4), respectively, with $c = 2$, $n_1 = m$, $n_2 = n$.
The exact small sample distribution, under the null hypothesis

(1.5) $F^{(i)} = F^{(j)}$ for all $i, j = 1, 2, \cdots, c$,

of the statistics defined by (1.3) and (1.4) for any number $c$ of samples, and for
any sample sizes $n_1, n_2, \cdots, n_c$, can be obtained by solving simple difference
equations which lend themselves to programming for machine computation.
Using this procedure, tables of values of

$P[D(n, n, n) \le r]$, $P[D(n, n) \le r]$, $P[D^+(n, n) \le r]$

were computed for selected values of $n$ between 1 and 40 and of $r = 1/n, 2/n, \cdots, n/n$.

Furthermore, the inequalities

$$P[D(n, n, \cdots, n) \le r] \ge 1 - [c(c-1)/2]\, P[D(n, n) > r]$$

$$P[D(n, n, \cdots, n) \le r] \ge 1 - [c(c-1)(c-2)/6]\, P[D(n, n, n) > r]$$

are noted, which may be useful for values of $c \ge 4$ for which tables are not
available.


2. The general difference equations. The set of all possible values of the
c-dimensional random variable $[F_{n_1}^{(1)}(x), F_{n_2}^{(2)}(x), \cdots, F_{n_c}^{(c)}(x)]$, for fixed $x$,
consists of the points

(2.1) $(k_1/n_1, k_2/n_2, \cdots, k_c/n_c)$, $k_i = 0, 1, 2, \cdots, n_i$; $i = 1, 2, \cdots, c$,

of the c-dimensional unit cube.

By the transformation $y_i = n_i x_i$ the c-dimensional unit cube is transformed
into the c-dimensional rectangular prism with sides $n_1, n_2, \cdots, n_c$, and the points
(2.1) are transformed into the points

(2.2) $(k_1, k_2, \cdots, k_c)$, $k_i = 0, 1, 2, \cdots, n_i$; $i = 1, 2, \cdots, c$.

Under the null hypothesis (1.5) the $c$ samples may be considered as $c$ successive
drawings of $n_1, n_2, \cdots, n_c$ observations from the same population, with
equal probabilities of each of the $N!$ ways of drawing the ordered sample of size
$N$, where

(2.4) $N = n_1 + n_2 + \cdots + n_c$.


The points (2.2) may be interpreted as being obtained in the following manner:
the sample values $X_k^{(i)}$, $k = 1, 2, \cdots, n_i$; $i = 1, 2, \cdots, c$, are observed
and plotted on the x-axis. It is agreed that, as one moves along the x-axis from
$-\infty$ to $+\infty$, the coordinate $k_i$ of the point (2.2) is increased by a unit whenever
a value $X_k^{(i)}$ is crossed. By this procedure one obtains a path through points of
the form (2.2), starting at $(0, 0, \cdots, 0)$ and ending at $(n_1, n_2, \cdots, n_c)$, and
each set of the $c$ samples determines such a path. Under the null hypothesis (1.5)
all these paths are equally probable, and their number is clearly

(2.5) $Q(n_1, n_2, \cdots, n_c) = N!/(n_1!\, n_2! \cdots n_c!)$.
We define, generally,

(2.6) $Q(k_1, k_2, \cdots, k_c)$ = number of paths from $(0, 0, \cdots, 0)$ to $(k_1, k_2, \cdots, k_c)$

for any non-negative integers $k_1, k_2, \cdots, k_c$. The function $Q$ satisfies the
difference equation

(2.7) $Q(k_1, k_2, \cdots, k_c) = Q(k_1 - 1, k_2, \cdots, k_c) + Q(k_1, k_2 - 1, \cdots, k_c) + \cdots + Q(k_1, k_2, \cdots, k_c - 1)$

since the number of ways of getting from $(0, 0, \cdots, 0)$ to $(k_1, k_2, \cdots, k_c)$ is
evidently the sum of the numbers of ways of getting to points from which
$(k_1, k_2, \cdots, k_c)$ can be reached in one step. We have for $Q$ the initial condition

(2.8) $Q(0, 0, \cdots, 0) = 1$.

To compute all values of $Q$ for $0 \le k_i \le n_i$, $i = 1, 2, \cdots, c$, one may start
with (2.8) and use (2.7) recursively, a procedure which can be programmed for
an electronic computer.
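The recursion (2.7) with the initial condition (2.8) can be sketched as follows. The function names are hypothetical, and the treatment of boundary points (a term is simply omitted when its coordinate would become negative) is an assumption that the text leaves implicit. The result is compared with the closed form (2.5).

```python
from functools import lru_cache
from math import factorial
from typing import Sequence


def count_paths(n: Sequence[int]) -> int:
    """Q(n_1, ..., n_c) from the difference equation (2.7) and condition (2.8)."""

    @lru_cache(maxsize=None)
    def Q(k: tuple) -> int:
        if all(ki == 0 for ki in k):
            return 1  # initial condition (2.8)
        total = 0
        for i, ki in enumerate(k):
            if ki > 0:  # a step into k along axis i must come from k_i - 1 >= 0
                total += Q(k[:i] + (ki - 1,) + k[i + 1:])
        return total

    return Q(tuple(n))


def multinomial(n: Sequence[int]) -> int:
    """Closed form (2.5): N! / (n_1! n_2! ... n_c!)."""
    out = factorial(sum(n))
    for ni in n:
        out //= factorial(ni)
    return out


q_two = count_paths((3, 4))       # 35, the binomial coefficient C(7, 3)
q_three = count_paths((2, 2, 2))  # 90 = 6! / (2! 2! 2!)
```

Memoization keeps the work proportional to the number of lattice points in the prism, which is the reason the procedure is practical for tabulation.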

Let now $R$ be a given set of points of the form (2.2), and let

(2.9) $Q(k_1, k_2, \cdots, k_c; R)$ = number of paths from $(0, 0, \cdots, 0)$ to
$(k_1, k_2, \cdots, k_c)$ which do not pass through any points in $R$.

Again the difference equation

(2.10) $Q(k_1, k_2, \cdots, k_c; R) = Q(k_1 - 1, k_2, \cdots, k_c; R) + Q(k_1, k_2 - 1, \cdots, k_c; R) + \cdots + Q(k_1, k_2, \cdots, k_c - 1; R)$

is satisfied, and can be solved recursively under condition (2.8) and the
additional conditions

(2.10.1) $Q(k_1, k_2, \cdots, k_c; R) = 0$ for $(k_1, k_2, \cdots, k_c)$ in $R$.

This, again, is an algorithm which can be programmed for an electronic computer,
but the program must now, among others, contain the instruction for the
computer to decide at each point $(k_1, k_2, \cdots, k_c)$ whether it belongs to $R$ or not.
We now define

(2.11) $P_R(n_1, n_2, \cdots, n_c) = Q(n_1, n_2, \cdots, n_c; R)/Q(n_1, n_2, \cdots, n_c)$,

the probability that, under the null hypothesis (1.5), the samples determine a
path from $(0, 0, \cdots, 0)$ to $(n_1, n_2, \cdots, n_c)$ which does not pass through any
point of $R$.

If, for a given set $R$, we agree to reject the hypothesis (1.5) whenever the
samples determine a path containing points in $R$, then $1 - P_R$ is the probability of an
error of the first kind, i.e. of rejecting the hypothesis when it is true. The
tabulation of $P_R$ is manageable for reasonable numbers of samples $c$ and sample sizes
$n_1, n_2, \cdots, n_c$, and for $R$ such that one can program for the computer a rule
for deciding whether a point is in $R$ or not.

For the Kolmogorov-Smirnov statistic $D(n_1, n_2)$ the sets $R$ are usually defined
by $D(n_1, n_2) > r$, which is equivalent with

(2.12) $R_r$: $|n_2 k_1 - n_1 k_2| > n_1 n_2 r$,

and for the one-sided statistic $D^+(n_1, n_2)$ by $D^+(n_1, n_2) > r$, equivalent with

(2.12.1) $R_r^+$: $n_2 k_1 - n_1 k_2 > n_1 n_2 r$.

For $n_1 = n_2 = n$, (2.12) and (2.12.1) become $|k_1 - k_2| > nr$ and $k_1 - k_2 > nr$,
respectively.
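For two samples the whole procedure, (2.10) with (2.8), (2.10.1) and (2.11) applied to the sets (2.12) or (2.12.1), fits in a short function. The sketch below is an illustration under assumptions: the function name and the `one_sided` flag are hypothetical, and exact rational arithmetic is used so that the membership test for $R$ is free of rounding error.

```python
from fractions import Fraction
from math import comb


def prob_ks_le_r(n1: int, n2: int, r, one_sided: bool = False) -> Fraction:
    """P[D(n1, n2) <= r], or P[D+(n1, n2) <= r] if one_sided, under (1.5)."""
    threshold = n1 * n2 * Fraction(r)
    Q = [[0] * (n2 + 1) for _ in range(n1 + 1)]
    for k1 in range(n1 + 1):
        for k2 in range(n2 + 1):
            diff = n2 * k1 - n1 * k2
            excess = diff if one_sided else abs(diff)
            if excess > threshold:
                Q[k1][k2] = 0  # (2.10.1): the point lies in R
            elif k1 == 0 and k2 == 0:
                Q[k1][k2] = 1  # (2.8)
            else:
                # (2.10); a term is dropped when its coordinate would be negative
                left = Q[k1 - 1][k2] if k1 > 0 else 0
                down = Q[k1][k2 - 1] if k2 > 0 else 0
                Q[k1][k2] = left + down
    return Fraction(Q[n1][n2], comb(n1 + n2, n1))  # (2.11) with (2.5)


p_22 = prob_ks_le_r(2, 2, Fraction(1, 2))                       # Fraction(2, 3)
p_33 = prob_ks_le_r(3, 3, Fraction(1, 3))                       # Fraction(2, 5)
p_44 = prob_ks_le_r(4, 4, Fraction(1, 4))                       # Fraction(8, 35)
p_22_plus = prob_ks_le_r(2, 2, Fraction(1, 2), one_sided=True)  # Fraction(5, 6)
```

These values correspond to the decimal entries 0.666666, 0.400000 and 0.228571 of Table 2 and 0.833333 of Table 3.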

Analogous multi-sample tests can be defined by using the statistics (1.3) or
(1.4) and the regions of rejection

$D(n_1, n_2, \cdots, n_c) > r$ and $D^+(n_1, n_2, \cdots, n_c) > r$,

respectively. The corresponding sets $R$ are

(2.13) $\sup_{i,j = 1, \cdots, c} |n_j k_i - n_i k_j| > n_i n_j r$,

and

(2.13.1) $\sup_{i < j} (n_j k_i - n_i k_j) > n_i n_j r$,

respectively.
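The same recursion extends to any number of samples. The following sketch handles the two-sided statistic (1.3); it reads the rejection set as "some pair $(i, j)$ has $|n_j k_i - n_i k_j| > n_i n_j r$", which is an interpretation of (2.13), and the function name is hypothetical.

```python
from fractions import Fraction
from itertools import product
from math import factorial
from typing import Sequence


def prob_multi_d_le_r(n: Sequence[int], r) -> Fraction:
    """P[D(n_1, ..., n_c) <= r] under (1.5), via (2.10), (2.10.1) and (2.11)."""
    n = tuple(n)
    r = Fraction(r)
    c = len(n)

    def in_R(k: tuple) -> bool:
        # Some pair of empirical cdf values k_i/n_i, k_j/n_j differs by more than r.
        return any(
            abs(n[j] * k[i] - n[i] * k[j]) > n[i] * n[j] * r
            for i in range(c)
            for j in range(i + 1, c)
        )

    Q = {}
    # product() yields lattice points in lexicographic order, so each predecessor
    # (one coordinate smaller by 1) is filled in before the point that needs it.
    for k in product(*(range(ni + 1) for ni in n)):
        if in_R(k):
            Q[k] = 0  # (2.10.1)
        elif sum(k) == 0:
            Q[k] = 1  # (2.8)
        else:
            Q[k] = sum(
                Q[k[:i] + (k[i] - 1,) + k[i + 1:]] for i in range(c) if k[i] > 0
            )  # (2.10)

    total = factorial(sum(n))  # (2.5)
    for ni in n:
        total //= factorial(ni)
    return Fraction(Q[n], total)  # (2.11)


p_222 = prob_multi_d_le_r((2, 2, 2), Fraction(1, 2))  # Fraction(2, 5)
p_333 = prob_multi_d_le_r((3, 3, 3), Fraction(1, 3))  # Fraction(9, 70)
```

These agree with the entries 0.400000 and 0.128571 of Table 1. For $c \ge 4$ the function can also be used to compare the exact probability with the lower bounds noted in Section 1, at a cost that grows with the number of lattice points $(n_1 + 1) \cdots (n_c + 1)$.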
It may be noted that the computations involved in tabulating

$P_R(n_1, n_2, \cdots, n_c)$

would not be much more difficult to program, nor much more time-consuming, if (2.13)
or (2.13.1) were replaced by more general sets $R$ such as

$|n_j k_i - n_i k_j| > f(k_1, k_2, \cdots, k_c)$

for some reasonably simple function $f$.

The tables described in the next section were computed by using difference 
equations (2.7) and (2.10). It should be stated that these difference equations 
have been well known and used for the case c = 2, and that closed expressions 
for $P_R(n_1, n_2)$ were obtained in special cases, e.g. by Gnedenko and Korolyuk
[3] and by Drion [2]. An excellent summary of the history of these methods may 
be found in the paper by Hodges [4]. A more recent paper by David [1] contains 
the derivation of the small-sample distribution and the asymptotic distribution 
of the statistic 


$$\max\{\sup_x [F_n^{(1)}(x) - F_n^{(2)}(x)],\ \sup_x [F_n^{(2)}(x) - F_n^{(3)}(x)],\ \sup_x [F_n^{(3)}(x) - F_n^{(1)}(x)]\}.$$
(2) 


3. Tables. Table 1 contains the probabilities $P[D(n, n, n) \le r]$ for n = 1
(1) 20 (2) 40 and consecutive integer values nr such that the probabilities for
each n range from less than .90 to more than .995.

Table 2 contains the probabilities $P[D(n, n) \le r]$ for n = 1 (1) 40 and
nr = 1 (1) min (n, 20).

Table 3 contains the probabilities $P[D^+(n, n) \le r]$ for n = 1 (1) 40 and
nr = 1 (1) min (n, 20).

All probabilities are given to six decimal places. Conservative error estimates 
assure an error <5.10~° throughout Table 1 and error <(2.3)10~° throughout 
Tables 2 and 3, but the actual errors are likely to be much smaller. 
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TABLE 1 
$P[D(n, n, n) \le r]$





| 1 000000 | ‘9 400000 | 0 128571 | 0 037402 | 0 010275 | 0 002719 | 0 000701 | 0 000177 | 0 000044 | 0 000010 
1 000000 | 0 771428 | 0 539220 | 0 355929 | 0 226374 | 0 140271 | 0 085256 0 051053 | | 0 030213 
| 1 000000 | | 0 926406 | 0 811093 | 0 684084 | 0 562086 | | 0 453012 | 0 350715 | 0 282279 

1 000000 | 0 978188 | 0 932164 | 0 868227 | 0 793917 | 0 715417 | 0 637148 

| 1 000000 | _ 0 993829 | 0 977501 | 0 950288 | 0 913501 | 0 869301 


1 000000 | 0 998303 | 0 992915 | 0 982475 | 0 966446 
| 1 000000 0 999541 | 0 997847 | 0 994114 

| 1 000000 | 0 999877 | 0 999362 

| 1.000000 | 0 999967 


| 1 000000 


0 000002 2 | 0 0 000000 | 0 000000 

0 017709 | 0 010297 | 0 065948 | 
0 219397 | 0 169169 | 0 129569 
0 562027 0 491832 | 0 427525 
0 819975 | 0 767590 | 0 713862 


0 944960 | 0 918575 | 0 888073 | 0 854312 | 0 818130 | 0 780302 | | 0 741807 | 0 voaaas | 0 e6szso | 0 624670 

0 987711 | 0 978261 0 949882 | 0 93122 | 0 909960 | 0 886458 | 0 861064 | 0 834155 | 0 806081 
0 998003 | 0 995601 0 986249 | 0.978802 | 0 960415 | 0 958006 0.944016 | 0 920088 | 0 913459 
0 999815 | 0 99309 0 997044 | 0 994729 | 0 991436 | 0 987039 | 0 981452 | 0 974624 | 0 966539 
0 999991 | 0 999947 0 999518 | 0 998964 | 0 998049 | | 0 996668 | | 0994723 | 0 992126 | 0 988805 
0 999997 0 999943 | 0 999844 Lidia lads | ial didi leben 73 
1 000000 | 0 ¢ 0 999995 | 0 999983 | 0 999950 0 999881 | 0 990754 | 0 999539 | 0 990205 
0 999999 | 0 999998 | 0 999994 0 990084 | 0 990961 | 0 999015 0 999834 
1 000000 | 0 999999 | 0 999999 | 0 999998 | 0 999995 | 0 999987 | 0 999971 
| 1 000000 | 0 999999 | 0 999999 | 0 999999 | 0 999998 | 0 999996 


22 28 30 — 1. 2 


0 876276 | 0 834896 | 0 790312 | 0 744128 | 0 697257 
0 946679 | 0 922382 | 0 893835 | 0 862177 | 0 828007 | 0 792099 | 0 754883 0 717132 | 0 679257 | 0 641658 





0 979784 | 0 967557 | 0 951678 | 0 932761 0 910963 | 0 886657 | 0 860253 | 0 832162 | 0 802779 | 0 772473 
11 | 0 993268 | 0 987984 0 980202 | 0 970227 | | 0 957886 | 0 943250 | | 0 926465 | 0 907727 | 0 887261 | | 0 865309 
0 998039 | 0 996114 | 0 992702 | 0 988029 | 0 981779 | 0 973852 | 0 964215 | 0 952885 | 0 939929 | 0 925445 
0 999504 | 0 998965 | 0 997584 | 0 995632 | 0 992789 | 0 988907 | 0 983879 | 0 977629 | | 0 970120 | 0 961345 
0 999892 | 0 999844 | 0 999284 | 0 998556 | 0 997392 | 0 995668 | 0 993276 | | 0 990117 0 986113 | 0 981208 


0.999980 | 0 999928 | 0 999811 0 999569 | 0 999139 | 0 998444 | 0 997404 | | 0 995937 | | 0 993968 | 0 991430 


| 











0 999741 0 999486 | 0 999073 | 0 998447 | 0 997552 | 0 996333 





| © 002903 | 0 001514 | 0 000787 | 0 000408 


0 167412 
0 520849 
0 788523 
0 925339 


0 979260 
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TABLE 2 
$P[D(n, n) \le r]$


+ 1.000000 | 0 666666 | | 0 400000 | 0 228571 | 0 126086 | 4 | 0 000264 | 0 087296 


“1000000 | 0 900000 0 771428 | 0 642857 | 0 525074 | 0 424825 
| 1.000000 | 0 971428 | | 0 920634 
| 1 000000 | 0 992063 | 0 974025 | 0 946960 


| 
i 
| 
| 
| 
| 


' 1 000000 





| 4 
“a 


0 131018 | 0 102194 
0 463902 | 0 411803 
0 744224 | 0 700079 


0 — | 0 rr 


0 seem | scart 


0 000211 
0 079484 | 0 061668 
0 364515 | 0 321861 
0 656679 | 0 614453 
0 845065 | 0 815583 


0 940970 | 0 924535 


0 857142 | 0 787878 


| 0 997835 | 0 991841 
se 


| 


0 999417 
1 000000 


| 
| | 
i 


lo 000109 
0 047743 


0 000056 
0 036892 
0 283588 | 0 249392 
0 573706 | 0 534647 
0 785465 | 0 755040 | 


0 906673 | 0 887622 


0 019891 | 
0 339860 
0 717327 
0 912975 | 
0 981351 


0 997513 
0 999844 
1 000000 


0 000028 
0 028460 
0 218952 
0 497409 
0 724581 


0 867606 


1 | 0 010530 0 005542 
0 269888 | 0 213070 
0 648292 | 0 582476 
| 0 874125 | 0 832178 
| 0 966433 | 0 947552 


| 0 993708 | 0 987680 
| 0 999259 | 0 997943 
| 0 999958 | 0 999783 
| # 000000 | 0 999989 


0 000014 | 0 

0 021922 | 0 016863 
| 0 191938 | 0 168030 
0 462071 | 0 428664 
0 694310 | 0 664409 


| 
0 846826 | 0 825466 


0 992140 oar |o wa 0 973751 | 0 965002 | 0 955047 
0 998503 | 0 997125 | 0 995100 | 0 992344 | 0 988800 | 0 984439 
0 999795 | 0 999500 0 998979 | 0 998162 | 0 996984 | 0 995380 
0 conees | © sense7 0 999836 | 0 999646 | 0 999329 | 0 998847 


0 995633 
0 999345 
0 999937 
0 999997 


0 943981 | 0 931910 | 0 918942 
0 979252 | 0 973250 | 0 966458 
0 993331 | 0 990776 | 0 987701 
0 998160 | 0 997232 | 0 996032 


0 900670 | 0 conaas 0 998884 
oeuns oem oceerae 
0 999971 


1 000000 | 0 999999 | 0 999995 


1 000000 | 0 999999 


0 999981 | 0 999947 | 0 999880 
0 999998 | 0 999994 | 0 999983 
1 000000 | 0 999999 | 0 999998 
| 1 000000 | 1 000000 | 0 999990 | 


0 999761 
0 999960 
0 999994 


0 990099 0 ons 0 om 0 van 


1 000000 
1 000000 
nna 


0 999999 | 0 999990 
1 000000 | 1 000000 
1 000000 | 1 000000 
1 000000 | 1 000000 

| 1 000000 


>| 
0 000000 | 0 000000 | 0 000000 | 0 000000 | 0 000000 0 00000 | 0 00000 
0 012955 | 0 009942 | 0 007622 | 0 005838 | 0 004468 | 0 003417 | © 002611 | 0 001993 | 0 001521 | 0 001160 
© 146921 | 0 128321 | 0 111963 | 0 097599 | 0 085006 | | 0.073080 | 0 064337 | 0 056014 0 048563 | 0 042153 
4 | 0 397187 | 0 367613 | 0 339899 | 0 313982 | 0 289796 | 0 267262 | 0 246302 | | 0 226833 | 0 208772 | 0 192036 
5 | | 0 635020 | 0 606260 0 578218 0 550963 | 0 524546 | 0 499004 | 0 474362 0 450633 | 0 427822 0 405929 


0 000003 | 0 000001 | 0 000001 
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21 22 


6 | 0 803687 0781631 0 759421 


7 | 0905183 | 0 890738 | 0 875705 | 


8 | 0.958911 | 0 950653 | 0 941731 
9 | 0984094 0 979952 | 0 975279 
10 | 0 994532 0 992710 | 0 990548 


| | 
11 | 0.998343 | 0 997641 | 0 996759 


12 | 0.999561 | 0 999326 | 0 999009 | 0 998598 | 0 998079 | 0 997439 


13 | 0.999899 | 


0 999996 | 0 999993 | 0 999986 


0 999999 | 0 999998 | 


1 000000 | 0 999999 

1.000000 | 1 000000 

1 000000 | 1 000000 

1.000000 | 1 000000 

31 | 32 

oGreeianons 

0 000000 | © 000000 

0 000884 | 0 000674 | 0 000513 

0 036570 | 0 031710 | 0 027482 
0 176546 | 0 162222 | 0 148989 
0 384946 | 0 364860 | 0 345656 
0 586454 
0 743954 
0 852579 
0 920879 


0 960438 


0 566263 
0 726991 
0 839930 
0 912317 
0 955137 


0 546505 
0 710076 
0 827085 
0 903453 
0 949530 


0 981599 
0 992054 
0 996821 
14 | 0 998825 
15 | 0.999600 
16 | 0 999875 


0 975325 
0 988735 
0 995206 
0 998102 
0 999466 0 999302 


| 0 999825 | 0 999762 


TABLE 2—(Continued) 





| 0 737166 | 0 714957 
0 860177 | 0 844239 
0 932196 | 0 922101 
0 970086 | 0 964388 
| 0 988034 | 0 985162 


| 0.995679 | 0 994385 


| 0 999831 | 0 999732 | 0 999594 | 0 999409 | 0 999167 
0 999980 | 0 999963 | 0 999936 | 0 999895 | 0 999837 | 0 999756 


0.999976 0 999960 
0 999995 | 0 999991 
0 999999 | 0 999998 
0 999999 | 0 999999 
1 000000 | 0 999999 
1 000000 | 1 000000 
aid x 


M 35 





0 000000 | 0 000000 
0 000390 | 0 000297 
0 023808 | 0 020615 
0 136773 | 0 125505 
0 327315 | 0 309815 


0 527197 | 0 508355 
0 693241 | 0 676518 
0 814080 | 0 800946 
0 894313 | 0 884922 
0 943629 | 0 937451 


0 971814 
0 986806 
0 994228 | 0 993128 
0 997644 | 0 997113 
0 999104 0 998868 


0 968060 
0 984695 


| 0 999683 | 0 999586 


27 28 29 30 


| 0 692876 | 0 670992 | 0 649361 0 628035 | 0 607054 
0 827971 | 0 811443 | 0 794721 | 0777865 | 0 760026 
0 911498 | 0 900437 | 0 888969 | 0 877140 | 0 864996 
0 958206 | 0 951561 | 0 944480 | 0 936988 | 0 929112 
0 981927 | 0 978330 | 0 974375 0 970069 | 0 965419 


| 0991109 | 0 989109 | 0 986859 | 0 984356 
| 0 996666 | 0 995750 | 0 994681 0 993451 
0 998861 | 0 998482 | 0 998020 | 0 997469 
0 999647 | 0 999505 0 999325 | 0 999100 


0 999901 | 0 999853 | 0 999790 | 0 999706 


| 0 992865 


| 0 999085 | 0 999975 | 0 999961 | 0 999940 | 0 999012 
| 0.999996 | 0 999994 | 0 999990 | 0 999984 | 0 999976 
| 0 999999 | 0 999998 | 0 999998 | 0 999996 | 0 999994 
| 0.999990 | 0 999999 | 0 999999 | 0 999990 | 0 999908 
| 1.000000 | 0 999999 | 0 999099 0 999999 | 0 999990 
|--— j - 


| 6 o<ukas gt 

0 000000 | 0 000000 | 0 000000 | 
0 000226 | 0 000171 | 0 000130 
0 017844 | 0 015440 | 0 013354 
0 115119 | 0 105553 | 0 096746 
6 293133 | 0 277243 | 0 262120 


38 


———| 


3” 


" — 

j 40 
i sata 
0 000000 
0 000075 
0 009980 
0 081194 


0 247737 | 0 234068 


0 489989 | 0 472106 
0 659934 | 0 643511 
0 787713 | 0 774409 
0 875305 | 0 865485 
0 931011 | 0 924322 


| 0 454713 | 0 437810 
0 627272 | 0 611234 
0 761059 | 0 747686 
0 855485 | 0 845325 
0 917402 | 0 910264 

i 


0 421399 
0 595412 
0 734312 
0 835027 
0 902925 


0 964067 | 0 959843 
0 982400 | 0 979921 | 0 977260 | 0 974418 | 0 971396 
0 991904 | 0 990551 0 989067 | 0 987450 | 0 985608 
0 996507 | 0 995820 | 0 995049 | 0 994189 | 0 993239 
eee 0 998265 | 0 997891 | 0 997464 | 0 996981 


| 0 999467 0 999325 | 0 999156 | 0 998958 | 0 998729 


0 955395 | 0 950731 | 0 945858 


17 | 0 999964 | 0 999947 | 0 999025 | 0 999896 | 0 999859 | 0 999812 | 0 999754 | 0 999683 | 0 999598 0 999496 
18 | 0 999900 | 0 999985 | 0 999978 | 0 999968 | 0 999955 | 0.990938 0 999916 | 0 999888 | 0 999854 | 0 999812 
19 | 0 999907 | 0 999996 | 0 999994 | 0 999991 | 0 999987 | 0. 999981 0 999973 | 0 999963 | 0 999950 | 0 999934 


20 | 0 999999 | 0 999999 | 0 999998 | 0 999997 0 999996 | 0 999994 | 0 999992 0 999988 | 0 999984 | 0 999978 
TABLE 3
P[D*(n, n) ≤ r]


10 
' 


| 333333 | 250000 | 200000 | 166666 | 142857 | 125000 | 111111 | 100000 | 90909

1 000000 | 0 833333 | 0 700000 | 0 600000 | 0 523809 | 0 464285 | 0 416666 | 0 377777 | 0 345454 | 0 318181
| 1 000000 | 0 950000 | 0 885714 | 0 821428 | 0 761904 | 0 708333 | 0 660606 | 0 618181 | 0 580419 

| | 1 000000 | 0 985714 | 0 960317 | 0 928571 | 0 893939 | 0 858585 | 0 823776 | 0 790200 

| 1 000000 | 0 996031 | 0 987012 | 0 973484 | 0 956487 | 0 937062 | 0 916083 


500000 


| 1 000000 | 0 998917 | 0 995920 | 0 990675 0 983216 | 0 973776 
| i | | 

| 1 000000 0 999708 0 998756 | 0 996853 | 0 993820 

1 000000 | 0 999922 | 0 999629 | 0 998971 

1 000000 | 0 999979 | 0 999891 

| 1 000000 | 0 999094 

Te ' 1.000000 

& 


| 
| 
| 
| 
_— 2 Sie 


17 | 20 


— =: aii 


76923 | 71428 | 66666 | 62500 | 58823 | 55555 | 52631 | 50000 | 47619


0 274725 | 0 257142 | 0 241666 | 0 227941 0 204678 | 0 194736 | 0 185714 | 0 177489 
0 516483 | 0 489285 | 0 464705 | 0 442401 0 403508 | 0 386466 | 0 370779 | 0 356295 
0 723021 | 0 699579 | 0 672875 | 0 647832 0 602339 | 0 581681 | 0 562281 | 0 544042 
0 872010 | 0 849789 | 0 827829 | 0 806308 0 765018 | 0 745371 | 0 726425 | 0 708187 
0 950226 | 0 936753 | 0 922523 | 0 907765 0 877401 | 0 862076 | 0 846798 | 0 831646 
i i 

0 984281 | 0 977863 0 970485 | 0 962267 0 943808 | 0 933796 0 923390 | 0 912705 
0 996070 | 0 993675 | 0 990608 | 0 986875 0 977523 | 0 971990 | 0 965955 | 0 959470 
0 999251 | 0 998562 | 0 997550 | 0 996172 0 992219 | 0 989626 | 0 986625 | 0 983229 
0 999897 | 0 999750 | 0 999489 | 0 999081 0 997694 | 0 996665 | 0 995388 | 0 993850 
0 999991 | 0 999968 | 0 999918 | 0 999823 0 999423 | 0 999080 0 998616 | 0 998016 
0 999997 | 0 999990 0 999973 0 999880 | 0 999785 0 999642 | 0 999442 

0 999999 | 0 999999 0 999997 0 999980 | 0 999958 | 0 999921 | 0 999864 

1 000000 | 1 000000 0 999999 | 0 999999 | 0 999997 | 0 999993 | 0 999985 | 0 999972 

| 1 000000 | 1 000000 | 1 000000 | 0 999999 | 0 999999 | 0 999997 | 0 999905 

1 000000 | 1 000000 | 1 000000 © seesee | © genes 0 999999 


1 000000 | 1 000000 1 000000 | 1.000000 | 0 999999 
| 1 000000 | 1 000000 | 1 000000 | 1 000000 

| 1000000 | 1 000000 | 1 000000 

| 1 000000 | 1 000000 

1 000000 





TABLE 3—(Continued) 


nr | 7 
' 
22 23 


4166 | 


27 


21 
0| 45454 


26 


37037 


40000 | 38461 


43478 | 34482| 33333 


35714 | 32258 


0 169960 | 0 163043 | 0 156666 | 0 150769 | 0 145299 0 140211 0 xa54e7 | 0131004 | 0 2nsst| 012208 
0 342885 | 0 330434 | 0 318846 | 0 308034 | 0 297924 | 0 288451 | 0 279556 | 0 271190 | 0 263306 

0 526877 | 0 510702 | 0 495441 | 0 481025 | 0 467390 | 0 454479 | 0 442237 | 0 430617 | 0 419574 

0 690650 | 0 673801 0 657621 | 0 642086 | 0 627173 0 612856 | 0 599108 | 0 585903 0 573216 
0 816681 | 0 801950 | 0 787488 | 0 773321 | 0 759466 Le rn toe 0 707348 
0.901793 0 890731 
0 952590 | 0 945365 
0 979455 | 0 975326 | 0 970865 | 
0 992047 | 0 989976 | 0 987639 | 
0 997266 | 0 996355 | 0 995274 


0 879577 | 


0 857183 | 0 846022 | 0 834926 | 0 823922 | 0 813028 
| 0.937846 | 


0 868380 

0 930077 | 0 922100 | 0 913953 | 0 905672 | 0 897287 | 0 888826 

0 966097 | 0 961050 | 0 955747 | 0 950216 | 0 944479 | 0 938562 | 0 932486 
0 985043 | 0 982194 | 0 979103 | 0 975780 | 0 972239 0 968493 | 0 964555 
0 994017 | 0 992581 | 0 990963 | 0 989165 | 0 987187 | 0 985034 | 0 982709 
| 90454 | 0 ong4zo 0 992178 
0 998333 | 0 997875 0 997340 | 0 996725 
0 999704 | 0 999583 | 0 999430 | 0 999241 | 0 999010 | 0 998734 
0 999918 0 999878 | 0 999823 | 0 999752 0 999662 | 0 999550 
| 0 999980 | 0 999968 | 0 999950 i 0 999853 


| 
0 997192 | 0 996432 | 0 995554 
0 999039 | 0 998719 | 


0 999171 | 0 998820 
0 999780 | 0 999663 
0 999949 | 0 999915 
0 999990 | 0 999981 
0 999998 | 0 999996 


0 998379 
0 999504 | 
0 999866 | 
0 999968 
0 999993 


0 997839 
0 999299 
0 999797 
0 999948 
0 999988 








| 
| 0 999990 0 999998 | 0 999997 | 0 999995 0 ooeova | 0 ooeee7 
1 000000 | 0 999999 | 0 999999 | 0 999999 0 999998 | 0 999997 
1 000000 | 1 000000 | 1 000000 | 0 999999 | 0 999999 | 0 999999 
1 000000 | 1 000000 | 1 000000 | 1 000000 | 1 000000 | 0 999999 
1 000000 | 1 000000 | 1 000009 | 1 000000 | 1 000000 | 0 999999 


32 33 x 35 


27777 


0 999999 
1 000000 
1 000000 | 
1 000000 | 
1 000000 


3 


31250 | 


0 999980 | 0 999970 
0 999995 | 0 999992 
0 999998 | 0 999998 
0 999999 | 0 999999 


0 999956 
0 999988 
0 999997 
0 999999 
0 999999 








% 


27027 


37 


26315 


30303 29411 8571 25641 24999 | 


i 
0 098717 | 0 096341 | 0 094076 
0 208630 | 0 203919 | 6 199416 


0 119318 | 0 115864 | 0 112605 


0 248830 | 0 242169 | 0 235854 


0 109523 | 0 106606 
0 229858 | 0 224158 


0 103840 
0 218732 


0 101214 
0 213562 





| 


| 0 999937 





0 399064 | 0 389525 | 0 380422 
0 549298 | 0 538019 | 0 527164 
0 683290 | 0 671750 | 0 660528 


| 
0 791638 | 0 781167 | 0 770856 | 
0 871777 | 0 863229 | 0 854689 
0 926272 | 0 919939 | 0 913505 | 
0 960438 | 0 956157 | 0 951724 | 
0 980219 | 0 977568 | 0 974764 


| 
0 990799 | 0 989294 | 0 987662 
0 996027 | 0 995241 | 0 994367 
0 998410 | 0 998034 | 0 997603 | 
0 999412 | 0 999247 | 0 999051 | 
0 999800 | 0 999733 | 0 999651 | 


| 0 999912 | 0 999881 
| 0 999973 0 999962 
0 999992 0 999989 
0 999998 | 0 999997 
0 999999 0 999999 


0 999982 
0 999995 
0 999998 


0 371726 | 0 363411 
0 516712 | 0 506644 
0 649616 | 0 639007 


0 760713 | 0 750743 
0 846173 | 0 837693 
0 906988 | 0 900402 
0 947152 


0 985907 | 
0 993403 | 0 992347 
0 997113 | 0 996564 
0 998821 | 0 998556 
0 999552 | 0 999434 


0 999841 | 0 999793 
0 999948 


| 0 999929 
| 0 999977 
0 999993 
0 999998 


0 999984 
0 999995 
0 999998 


718 


0 355454 | 0 347832 
0 496940 | 0 487582 
0 628693 | 0 618666 


0 731333 
0 820888 
0 887081 


0 74000 | 
cet 
| 0 893763 
0 942454 0 937643 0 932729 0 927724 | 
0 971814 | 0 968725 | 0 965504 | 0 962160 | 0 958699 | 


0 984029 | 0 982033 | 0 979921 | 0 977697 


| 0 991200 | 0 989960 
0 995952 | 0 995275 
0 998253 | 0 997910 

lo 999294 | 0 999132 


| 0 999733 | 0 999662 
0.999906 | 0 999877 
0 999969 | 0 999958 


0 999997 0 999996 


0 340525 | 0 333514 | 0 326782 
0 478554 | 0 469840 | 0 461425 
| 0 608016 | 0 590435 | 0 590215 
| 0 721895 
0 812582 
0 830371 


0 712638 | 0 703559 
0 804349 | 0 796197 
| 0 873642 | 0 866904 
0 922638 | 0 917480 
0 955130 | 0 951459 
i 
0 975365 | 0 972929 
0 987209 | 0 985698 
0 993725 | 0 992849 
0 997094 | 0 996619 
0 998732 | 0 998490 


| 0 988630 
| 0 994533 | 
| 0 997524 

0 998045 | 


_ 0 999578 | 0 999479 | 0 999364 
0 999841 | 0 999799 | 0 999748 
0.999944 | 0 999927 0 999905 
0 999981 | 0 999975 0 999967 
0 999994 | 0 999992 0 999989 
Table 2 is an extension of the table given by Massey [5]. Tables 1 and 3 appear
to be new. Table 3 could also have been computed by a method due to Drion [2]. 

The computations were programmed for and carried out on the IBM 650 of 
the Research Computer Laboratory of the University of Washington. The authors 
wish to express their sincere appreciation to Professor D. B. Dekker for his gen- 
erous help in planning and performing these computations. 


4. Case of c > 3. With increasing number of samples c, the computations are 
not more complicated in structure but quickly become prohibitive in view of the 
increasing demand on the storage capacity of the computer and the number of 
additions required. Tabulations similar to those presented in the preceding
section, while feasible, would hardly be worth the effort for many values of c > 3.
Should exact tests based on the statistics $D(n_1, n_2, \dots, n_c)$, $D^{*}(n_1, n_2, \dots, n_c)$
be practically needed then, instead of computing tables, it may be preferable to 
prepare a program for an electronic computer which, for given sample values, 
would calculate the single probability needed in every specific case. 

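Lacking such a program, one may for $c \ge 3$ make use of the following simple
inequalities.

One clearly has, for $c \ge 3$,

\[
\begin{aligned}
P[D(n_1, n_2, \dots, n_c) \le r]
  &= P\Bigl[\max_{1 \le i < j \le c} \sup_x \bigl| F^{(i)}_{n_i}(x) - F^{(j)}_{n_j}(x) \bigr| \le r\Bigr] \\
  &= 1 - P\Bigl[\sup_x \bigl| F^{(i)}_{n_i}(x) - F^{(j)}_{n_j}(x) \bigr| > r \text{ for some } i < j\Bigr] \\
  &\ge 1 - \sum_{1 \le i < j \le c} P[D(n_i, n_j) > r]
\end{aligned}
\]

and, for $n_1 = n_2 = \dots = n_c = n$, $c \ge 3$,

\[
P[D(n, n, \dots, n) \le r] \ge 1 - [c(c-1)/2]\, P[D(n, n) > r]. \tag{4.1}
\]

For $c \ge 4$, one similarly obtains

\[
P[D(n, n, \dots, n) \le r] \ge 1 - [c(c-1)(c-2)/6]\, P[D(n, n, n) > r]. \tag{4.2}
\]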


These inequalities make it possible to use the statistic $D(n, n, \dots, n)$ for
testing the hypothesis (1.5) using only Table 1 or Table 2, whichever yields a greater
value for the right side of (4.2) or (4.1), respectively. The test will be conserva- 
tive, i.e. the probability of error of the first kind is less than that obtained from 
(4.2) or (4.1), but for the conventional “significance levels” and c not too large 
the right sides in both inequalities should be close approximations to the left 
side. 

Similar inequalities are easily obtained for the statistic $D^{*}(n, n, \dots, n)$.

It has been pointed out quite strikingly by Hodges [4] that, for c = 2, asymp- 
totic expressions such as that due to Smirnov [6, 7] are inaccurate even for fairly 
large values of n, to an extent which makes it inadvisable to use them. It appears, 
therefore, rather doubtful that good approximations can be found for c > 3, 
and as long as such approximations are not available inequalities of the kind of 
(4.1) or (4.2) may be of practical use. 
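
The right side of (4.1) can be evaluated without a printed table, since for two equal samples the exact distribution reduces to counting lattice paths. The following is only a minimal sketch under stated assumptions: the names `band_probability`, `two_sided` and `lower_bound_4_1` are hypothetical, the null hypothesis of a common continuous distribution is assumed, and the argument `r` is taken as an integer multiple of 1/n (that is, the event is n D(n, n) ≤ r). It is not the program used by the authors on the IBM 650.

```python
from math import comb


def band_probability(n, lower, upper):
    """Probability that n*(F_n(x) - G_n(x)) stays in [lower, upper] for all x.

    Assumes two samples of equal size n from one continuous distribution, so
    every ordering of the pooled sample is equally likely and the problem
    reduces to counting lattice paths from (0, 0) to (n, n).
    Requires lower <= 0 <= upper.
    """
    counts = [[0] * (n + 1) for _ in range(n + 1)]
    counts[0][0] = 1
    for i in range(n + 1):
        for j in range(n + 1):
            if (i, j) == (0, 0) or not (lower <= i - j <= upper):
                continue  # cells outside the band keep a count of zero
            from_left = counts[i - 1][j] if i > 0 else 0
            from_below = counts[i][j - 1] if j > 0 else 0
            counts[i][j] = from_left + from_below
    return counts[n][n] / comb(2 * n, n)


def two_sided(n, r):
    """P[n * D(n, n) <= r] for an integer r (assumed convention)."""
    return band_probability(n, -r, r)


def lower_bound_4_1(c, n, r):
    """Right side of (4.1): 1 - [c(c-1)/2] * P[n * D(n, n) > r]."""
    return 1.0 - (c * (c - 1) / 2) * (1.0 - two_sided(n, r))
```

For small n and r the bound can be zero or negative and is then uninformative; it is only of interest near the conventional significance levels, as the text notes.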
POLYA TYPE DISTRIBUTIONS OF CONVOLUTIONS¹


By Samuel Karlin and Frank Proschan
Stanford University

1. Introduction. The theory of totally positive kernels and Pólya type
distributions has been decisively and extensively applied in several domains of
mathematics, statistics, economics and mechanics. Totally positive kernels arise
naturally in developing procedures for inverting, by differential polynomial operators
[7], integral transformations defined in terms of convolution kernels. The theory
of Pólya type distributions is fundamental in permitting characterizations of
best statistical procedures for decision problems [8] [9] [13]. In clarifying the
structure of stochastic processes with continuous path functions we encounter
totally positive kernels [11] [12]. Studies in the stability of certain models in
mathematical economics frequently use properties of totally positive kernels
[10]. The theory of vibrations of certain types of mechanical systems (primarily
coupled systems) involves aspects of the theory of totally positive kernels [5].

In this paper, we characterize new classes of totally positive kernels that arise 
from summing independent random variables and forming related first passage 
time distributions. 

A function $f(x, y)$ of two real variables ranging over linearly ordered one
dimensional sets $X$ and $Y$ respectively, is said to be totally positive of order $k$
($TP_k$) if for all $x_1 < x_2 < \dots < x_m$, $y_1 < y_2 < \dots < y_m$ ($x_i \in X$; $y_j \in Y$) and
all $1 \le m \le k$,

\[
f\begin{pmatrix} x_1, x_2, \dots, x_m \\ y_1, y_2, \dots, y_m \end{pmatrix}
= \begin{vmatrix}
f(x_1, y_1) & f(x_1, y_2) & \cdots & f(x_1, y_m) \\
f(x_2, y_1) & f(x_2, y_2) & \cdots & f(x_2, y_m) \\
\vdots & \vdots & & \vdots \\
f(x_m, y_1) & f(x_m, y_2) & \cdots & f(x_m, y_m)
\end{vmatrix} \ge 0. \tag{1}
\]

Typically, $X$ is an interval of the real line, or a countable set of discrete values
on the real line such as the set of all integers or the set of non-negative integers;
similarly for $Y$. When $X$ or $Y$ is a set of integers, we may use the term “sequence”
rather than “function.”
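
Definition (1) can be probed numerically on a finite grid. The sketch below is an illustration only: the helper names `det` and `is_totally_positive` are hypothetical, the grids must be supplied in increasing order, and a finite check can refute total positivity but cannot prove it.

```python
from itertools import combinations
from math import exp


def det(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [row[:] for row in matrix]
    n = len(a)
    sign, result = 1.0, 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if a[pivot][col] == 0.0:
            return 0.0
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            sign = -sign  # a row swap flips the sign of the determinant
        result *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return sign * result


def is_totally_positive(kernel, xs, ys, order, tol=1e-12):
    """Check the determinants in (1) for every m <= order on increasing grids xs, ys."""
    for m in range(1, order + 1):
        for rows in combinations(xs, m):
            for cols in combinations(ys, m):
                minor = det([[kernel(x, y) for y in cols] for x in rows])
                if minor < -tol:
                    return False
    return True
```

As a usage note, the kernel exp(xy) passes the check on small grids, while x + y fails already at order 2, since its second order determinant equals -(x_2 - x_1)(y_2 - y_1).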

A related, weaker property is that of sign regularity. A function $f(x, y)$ is
sign regular of order $k$, if for every $x_1 < x_2 < \dots < x_m$, $y_1 < y_2 < \dots < y_m$,
and $1 \le m \le k$, the sign of

\[
f\begin{pmatrix} x_1, x_2, \dots, x_m \\ y_1, y_2, \dots, y_m \end{pmatrix}
\]

depends on $m$ alone.


If a $TP_k$ function $f(x, y)$ is a probability density in one of the variables, say $x$,
with respect to a $\sigma$-finite measure $\mu(x)$, for each fixed value of $y$, then $f(x, y)$
is said to be Pólya type of order $k$ ($PT_k$). The concepts of $PT_1$ and $PT_2$ densities
are familiar ones. Every density characterized by a parameter is $PT_1$; while
the $PT_2$ densities are those having a monotone likelihood ratio [13].

Received August 24, 1959; revised April 1, 1960.
¹ This work was supported by the Office of Naval Research under Task NR 047-019.

A further specialization occurs if a $PT_k$ kernel may be written as a function
$f(x - y)$ of the difference of $x$ and $y$ where $x$ and $y$ traverse the real line; $f(u)$
is then said to be a Pólya frequency density of order $k$ ($PF_k$).

Finally, if the subscript $\infty$ is written in any of the definitions, then the property
in question will be understood to hold for all positive integers.


2. Summary of Results. From Lemma 3 below we trivially obtain the result
that if $f_1, f_2, \dots$ are density functions of non-negative random variables with
each $f_i$ a $PF_k$, then $g(n, x) = f_1 * f_2 * \dots * f_n(x)$ ($*$ indicates convolution) is
$PF_k$ in differences of $x$ for each $n > 0$. One of the key results of this paper is
that under the same hypothesis $g(n, x)$ is $PT_k$ in the variables $n$ and $x$, where $n$
ranges over the positive integers and $x$ traverses the positive real line. That is,
total positivity in translation variables (differences of the argument) for each
density implies total positivity in the pair: the argument and the order of the
convolution. (Theorem 1 of Section 4.)

As an easy consequence, we obtain that

\[
h(n, x) = P\Bigl[\sum_{i=1}^{n} X_i \ge x\Bigr],
\]

where the $X_i$ are independent observations from the corresponding $f_i$,
$i = 1, 2, \dots$, is $TP_k$ in the variables $n$ and $x$. The kernel $h(n, x)$ can be
interpreted as the probability that first passage into the set $[x, \infty)$ occurs at or before
the $n$th transition where the successive partial sums $S_n = \sum_{i=1}^{n} X_i$,
$n = 0, 1, 2, \dots$ ($S_0 = 0$) describe a discrete time real valued Markov process.
If the $X_i$ are not identically distributed then the process is not time homogeneous.
In this formulation the statement concerning the first passage probability
function can be extended to the case of random variables ranging over the whole
real line. Thus Theorem 2 of Section 4 asserts that for Pólya frequency densities
of a given order, the probability that first passage into the set $[x, \infty)$ for the
stochastic process of successive partial sums occurs at the $n$th transition, is a
totally positive function in the variables $n$, $x$ of the same order. In this
framework, Theorem 1 can be deduced from Theorem 2 by employing a suitable
limiting argument. Further results of this sort are given in Section 4 and Section 5.

A different kind of characterization is given in Theorem 8 of Section 6. There
it is shown that $g(n, x)$, the $n$-fold convolution of a $PF_k$ density extending over
the whole real line, although not possessing the full variation diminishing
property of a $TP_k$ function, does possess a restricted variation limiting property.
Specifically, $\sum_{i=1}^{m} a_i g(n_i, x)$ has at most $2(m - 1)$ sign changes,² where
$n_1 < n_2 < \dots < n_m$, $m \le (k + 1)/2$, and the $a_i$ are real non-zero constants.


² The number of sign changes $V(f)$ of a real valued function $f$ is $\sup V(f(x_i))$ where
In Section 7 we establish several smoothening properties possessed by the
kernel $f^{(n)}(x)$ (the $n$-fold convolution of $f$), when it defines a linear
transformation. In particular, we prove that if $f(x)$ is $PF_3$ and $g(x)$ is convex (concave) then
$h(n) = \int f^{(n)}(x) g(x)\, dx$ is convex (concave). This fact is useful in applications.

In Sections 8 and 9 various applications of these results are noted. The inven- 
tory problem discussed in Section 8 originally motivated the theoretical results of 
the present paper; it is exposed here to illustrate the kind of applications made 
available by exploiting the theorems of Section 4. It is possible to show with the 
aid of Theorem 1 that the objective function of the inventory problem is concave, 
so that its maximization becomes a relatively easy task and can be reduced to
a rather standard non-linear programming calculation.

In Section 9, a number of totally positive functions are constructed by forming
successive convolutions of Pólya frequency densities and then applying Theorem
1. As an illustration of the theory we obtain that $g(n, x) = (x - A_n)^{K_n}$ for
$x > A_n$, and 0 for $x \le A_n$, is $TP_\infty$ in $x$ and $n$, provided $A_n$ is any increasing
function of $n$ and $K_n$ is any strictly increasing integer-valued function of $n$.

In a subsequent publication, Karlin will indicate other generalizations and 
applications of the results of this paper to the theory of stochastic processes and 
orthogonal polynomials. For example, we will extend the results from a discrete 
time formulation corresponding to integer convolutions to a continuous time 
stochastic process structure. In this framework the present theory bears a close 
relationship to some recent studies of Karlin and McGregor [11] concerned 
with totally positive kernels and diffusion processes. We will also develop further 
the connections of total positivity and absorption and recurrence probabilities 
for the state variable of certain kinds of stochastic processes. 

In [15], Proschan has discussed in detail the inventory model described in 
Section 8 with applications to some concrete examples. Theorem 1 plays a crucial 
role in this study. 

3. Preliminaries. Many of the structural properties of $TP_k$ functions are
deducible from the following identity, which appears in [14], p. 48, problem 68:

LEMMA 1: If $r(x, w) = \int p(x, t) q(t, w)\, d\sigma(t)$ and the integral converges
absolutely, then

\[
r\begin{pmatrix} x_1, x_2, \dots, x_k \\ w_1, w_2, \dots, w_k \end{pmatrix}
= \int \cdots \int_{t_1 < t_2 < \dots < t_k}
p\begin{pmatrix} x_1, x_2, \dots, x_k \\ t_1, t_2, \dots, t_k \end{pmatrix}
q\begin{pmatrix} t_1, t_2, \dots, t_k \\ w_1, w_2, \dots, w_k \end{pmatrix}
d\sigma(t_1) \cdots d\sigma(t_k). \tag{2}
\]

In particular, we secure from Lemma 1, the following useful result:

LEMMA 2: If $f(x, t)$ is $TP_m$ and $g(t, w)$ is $TP_n$, then
$h(x, w) = \int f(x, t) g(t, w)\, d\sigma(t)$ is $TP_{\min(m, n)}$.


$V(f(x_i))$ is the number of sign changes of the sequence $f(x_1), f(x_2), \dots, f(x_m)$ with $x_i$ chosen
arbitrarily from the domain of definition of $f$ and arranged so that $x_1 < x_2 < \dots < x_m$, and
$m$ any positive integer.
We shall exploit this result principally in the case when $f$ and $g$ are Pólya
frequency densities. Therefore,

LEMMA 3: If $f(x)$ is $PF_m$ and $g(x)$ is $PF_n$, then $h(x) = \int f(x - t) g(t)\, dt$
is $PF_{\min(m, n)}$.

An important feature of totally positive functions is their variation diminishing
property: If $f(x, w)$ is $TP_k$ and $g(w)$ changes sign $j \le k - 1$ times, then
$h(x) = \int f(x, w) g(w)\, d\sigma(w)$ changes sign at most $j$ times; moreover, if $h(x)$
actually changes sign $j$ times, then it must change sign in the same order as
$g(w)$ as $x$ and $w$ traverse the real line from left to right [8] [9]. This distinctive
property underlies many of the applications mentioned above. The variation
diminishing property is essentially equivalent to the determinantal inequalities (1).


4. Convolution of Non-Negative Random Variables. We first prove 

THEOREM 1: Let $f_1, f_2, \dots$ be any sequence of densities of non-negative random
variables, with each $f_i$ a $PF_k$. Then the $n$-fold convolution
$g(n, x) = f_1 * f_2 * \dots * f_n(x)$ is $PT_k$ in the variables $n$ and $x$, where $n$ ranges
over $1, 2, \dots$ and $x$ traverses the positive real line.

PROOF: The proof employs induction. First note that $g(n, x)$ is $PT_1$ since
$g(n, x) \ge 0$ for each real $x$ and each positive integer $n$.

Assume now that for every sequence of densities satisfying the hypothesis,
the corresponding $n$-fold convolution has been proven $PT_{r-1}$ for $r \le k$. We prove
that this implies $g(n, x)$ is $PT_r$.

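(a) First consider the case $n_1 = 1$. Given $1 = n_1 < n_2 < \dots < n_r$,
$0 \le x_1 < x_2 < \dots < x_r$, we may write

\[
g\begin{pmatrix} 1, n_2, \dots, n_r \\ x_1, x_2, \dots, x_r \end{pmatrix}
= \sum_{\nu=1}^{r} (-1)^{\nu-1} f_1(x_\nu)\,
g\begin{pmatrix} n_2, \dots, n_r \\ x_1, \dots, x_{\nu-1}, x_{\nu+1}, \dots, x_r \end{pmatrix} \tag{3}
\]

simply by expanding the determinant on the left by its first row. Next note that
for $n = 2, 3, \dots$ and $x \ge 0$,

\[
g(n, x) = \int_0^\infty g_1(n - 1, \xi) f_1(x - \xi)\, d\xi, \tag{4}
\]

where $g_1(n - 1, \xi)$ is defined as $f_2 * f_3 * \dots * f_n(\xi)$. Applying (2) in (4), we
may write

\[
g\begin{pmatrix} n_2, \dots, n_r \\ x_1, \dots, x_{\nu-1}, x_{\nu+1}, \dots, x_r \end{pmatrix}
= \int \cdots \int_{0 \le \xi_1 < \xi_2 < \dots < \xi_{r-1}}
g_1\begin{pmatrix} n_2 - 1, \dots, n_r - 1 \\ \xi_1, \dots, \xi_{r-1} \end{pmatrix}
f_1\begin{pmatrix} x_1, \dots, x_{\nu-1}, x_{\nu+1}, \dots, x_r \\ \xi_1, \dots, \xi_{r-1} \end{pmatrix}
d\xi_1\, d\xi_2 \cdots d\xi_{r-1}, \tag{5}
\]

where the compound symbol for $f_1$ denotes the determinant formed from the kernel $f_1(x - \xi)$.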
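Inserting (5) into (3), we get immediately,

\[
\begin{aligned}
g\begin{pmatrix} 1, n_2, \dots, n_r \\ x_1, x_2, \dots, x_r \end{pmatrix}
&= \int \cdots \int_{0 \le \xi_1 < \xi_2 < \dots < \xi_{r-1}}
g_1\begin{pmatrix} n_2 - 1, \dots, n_r - 1 \\ \xi_1, \dots, \xi_{r-1} \end{pmatrix}
\sum_{\nu=1}^{r} (-1)^{\nu-1} f_1(x_\nu)\,
f_1\begin{pmatrix} x_1, \dots, x_{\nu-1}, x_{\nu+1}, \dots, x_r \\ \xi_1, \dots, \xi_{r-1} \end{pmatrix}
d\xi_1 \cdots d\xi_{r-1} \\
&= \int \cdots \int_{0 \le \xi_1 < \xi_2 < \dots < \xi_{r-1}}
g_1\begin{pmatrix} n_2 - 1, \dots, n_r - 1 \\ \xi_1, \dots, \xi_{r-1} \end{pmatrix}
f_1\begin{pmatrix} x_1, x_2, \dots, x_r \\ 0, \xi_1, \dots, \xi_{r-1} \end{pmatrix}
d\xi_1 \cdots d\xi_{r-1},
\end{aligned} \tag{6}
\]

where the compound symbols for $f_1$ denote determinants formed from the kernel $f_1(x - \xi)$.

But $g_1\begin{pmatrix} n_2 - 1, \dots, n_r - 1 \\ \xi_1, \dots, \xi_{r-1} \end{pmatrix} \ge 0$ by the induction assumption, while
$f_1\begin{pmatrix} x_1, x_2, \dots, x_r \\ 0, \xi_1, \dots, \xi_{r-1} \end{pmatrix} \ge 0$ since $f_1$ is $PF_k$ by the hypothesis of the theorem and
$0 \le \xi_1 < \xi_2 < \dots < \xi_{r-1}$. Hence

\[
g\begin{pmatrix} 1, n_2, \dots, n_r \\ x_1, x_2, \dots, x_r \end{pmatrix} \ge 0. \tag{7}
\]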
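(b) Now suppose $n_1 > 1$. Then for any $n_1 < n_2 < \dots < n_r$ and
$0 \le x_1 < x_2 < \dots < x_r$ we may write, using (2) and (4):

\[
g\begin{pmatrix} n_1, n_2, \dots, n_r \\ x_1, x_2, \dots, x_r \end{pmatrix}
= \int \cdots \int_{\xi_1 < \xi_2 < \dots < \xi_r}
g_1\begin{pmatrix} n_1 - 1, n_2 - 1, \dots, n_r - 1 \\ \xi_1, \xi_2, \dots, \xi_r \end{pmatrix}
f_1\begin{pmatrix} x_1, x_2, \dots, x_r \\ \xi_1, \xi_2, \dots, \xi_r \end{pmatrix}
d\xi_1\, d\xi_2 \cdots d\xi_r. \tag{8}
\]

From (8) we see that for every sequence of densities satisfying the hypothesis,
the corresponding functions $g_1$, $g$ satisfy

\[
g_1\begin{pmatrix} n_1 - 1, n_2 - 1, \dots, n_r - 1 \\ \xi_1, \xi_2, \dots, \xi_r \end{pmatrix} \ge 0
\text{ for all } \xi_1 < \dots < \xi_r
\quad \text{implies} \quad
g\begin{pmatrix} n_1, n_2, \dots, n_r \\ x_1, x_2, \dots, x_r \end{pmatrix} \ge 0. \tag{9}
\]

Using (7) and (9), it follows by induction that
$g\begin{pmatrix} n_1, n_2, \dots, n_r \\ x_1, x_2, \dots, x_r \end{pmatrix} \ge 0$.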

Since $g(n, x)$ has thereby been proven $PT_r$, we have established the validity
of the induction step, and the theorem follows.

It is important to emphasize the distinction between Lemma 3 and Theorem 1.
Under the hypothesis of Theorem 1, Lemma 3 states that for each fixed positive
integer $n$, $g(n, x)$ is $PF_k$ in differences of $x$, while Theorem 1 states that $g(n, x)$ is
$PT_k$ in the variables $n$ and $x$.

Will Theorem 1 hold if the random variables are not restricted to be
non-negative? In general, the answer is no, as the following example shows.

EXAMPLE: Let $f_1(x) = f_2(x) = \dots = (1/\sqrt{2\pi})\, e^{-x^2/2}$, a $PF_\infty$. Then

\[
g(n, x) = f^{(n)}(x) = (1/\sqrt{2\pi n})\, e^{-x^2/(2n)}.
\]

For $1 \le n_1 < n_2$, $x_1 < x_2$, the second order determinant is positive for
$0 \le x_1 < x_2$ and negative for $x_1 < x_2 \le 0$. Thus $g(n, x)$ is not $PT_2$.
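
The sign pattern claimed in the example is easy to confirm numerically. The sketch below uses hypothetical helper names and simply evaluates the second order determinant g(n_1, x_1) g(n_2, x_2) - g(n_1, x_2) g(n_2, x_1) for the normal convolution density at a few points.

```python
from math import exp, pi, sqrt


def g(n, x):
    """n-fold convolution of the standard normal density: the N(0, n) density."""
    return exp(-x * x / (2 * n)) / sqrt(2 * pi * n)


def second_order_determinant(n1, n2, x1, x2):
    """Determinant of the 2 x 2 matrix [g(n_i, x_j)] with n1 < n2 and x1 < x2."""
    return g(n1, x1) * g(n2, x2) - g(n1, x2) * g(n2, x1)


# Both points on the positive half-line: the determinant is positive.
positive_side = second_order_determinant(1, 2, 0.5, 1.5)
# Both points on the negative half-line: the determinant is negative.
negative_side = second_order_determinant(1, 2, -1.5, -0.5)
```

The change of sign reflects the fact that the ratio g(n_2, x)/g(n_1, x) increases in |x| rather than in x, so a monotone likelihood ratio cannot hold on the whole line.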

However a generalization of Theorem 1 to the case of random variables ranging
over the whole real line is possible, as developed in Theorem 2 below. In the
more general case, total positivity holds, not for the $n$-fold convolution, but
rather for the first passage time probabilities of the partial sum process.

THEOREM 2: Let $f_1, f_2, \dots$ be any sequence of $PF_k$ densities of random variables
$X_1, X_2, \dots$ respectively, which are not necessarily non-negative. Consider the first
passage probability for $x$ positive:

\[
h(n, x) = P\Bigl[\sum_{i=1}^{n} X_i \ge x;\ \sum_{i=1}^{j} X_i < x,\ j = 1, 2, \dots, n - 1\Bigr]
\]

for $n = 1, 2, \dots$.
Then $h(n, x)$ is $TP_k$, where $n$ ranges over $1, 2, \dots$ and $x$ traverses the positive
axis.

PROOF: The proof proceeds in a similar fashion to that of Theorem 1. We
employ induction. First we note that $h(n, x)$ is $TP_1$ since $h(n, x) \ge 0$ by its
very meaning.

Assume now that for every sequence of densities satisfying the hypothesis,
the associated first passage time probability function is $TP_{r-1}$ for $r \le k$. We
shall prove that this implies $h(n, x)$ is $TP_r$. From this the conclusion of the
theorem will follow.


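We clearly have for $x$ positive that

\[
h(n, x) =
\begin{cases}
\displaystyle \int_x^\infty f_1(\xi)\, d\xi & \text{for } n = 1, \\[2ex]
\displaystyle \int_0^\infty f_1(x - \xi) h_1(n - 1, \xi)\, d\xi & \text{for } n = 2, 3, \dots,
\end{cases} \tag{10}
\]

where

\[
h_1(n - 1, \xi) = P\Bigl[\sum_{i=2}^{n} X_i \ge \xi;\ \sum_{i=2}^{j} X_i < \xi,\ j = 2, 3, \dots, n - 1\Bigr].
\]

We consider first the case $n_1 = 1$. Given $1 = n_1 < n_2 < \dots < n_r$,
$0 < x_1 < x_2 < \dots < x_r$, we may write, using (10),

\[
\begin{aligned}
h\begin{pmatrix} 1, n_2, n_3, \dots, n_r \\ x_1, x_2, x_3, \dots, x_r \end{pmatrix}
&= \begin{vmatrix}
\int_0^\infty f_1(x_1 + \xi)\, d\xi & \int_0^\infty f_1(x_2 + \xi)\, d\xi & \cdots & \int_0^\infty f_1(x_r + \xi)\, d\xi \\
h(n_2, x_1) & h(n_2, x_2) & \cdots & h(n_2, x_r) \\
\vdots & \vdots & & \vdots \\
h(n_r, x_1) & h(n_r, x_2) & \cdots & h(n_r, x_r)
\end{vmatrix} \\
&= \sum_{\nu=1}^{r} (-1)^{\nu-1} \int_0^\infty f_1(x_\nu + \xi)\, d\xi \;
h\begin{pmatrix} n_2, \dots, n_r \\ x_1, \dots, x_{\nu-1}, x_{\nu+1}, \dots, x_r \end{pmatrix}.
\end{aligned}
\]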
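Now using (10) and (2), we obtain

\[
h\begin{pmatrix} n_2, \dots, n_r \\ x_1, \dots, x_{\nu-1}, x_{\nu+1}, \dots, x_r \end{pmatrix}
= \int \cdots \int_{0 \le \xi_1 < \xi_2 < \dots < \xi_{r-1}}
f_1\begin{pmatrix} x_1, \dots, x_{\nu-1}, x_{\nu+1}, \dots, x_r \\ \xi_1, \xi_2, \dots, \xi_{r-1} \end{pmatrix}
h_1\begin{pmatrix} n_2 - 1, n_3 - 1, \dots, n_r - 1 \\ \xi_1, \xi_2, \dots, \xi_{r-1} \end{pmatrix}
d\xi_1\, d\xi_2 \cdots d\xi_{r-1}, \tag{11}
\]

where the compound symbol for $f_1$ denotes the determinant formed from the kernel $f_1(x - \xi)$.
Inserting (11) in the equation above and replacing $-\xi$ by $\xi$, gives

\[
\begin{aligned}
h\begin{pmatrix} 1, n_2, n_3, \dots, n_r \\ x_1, x_2, x_3, \dots, x_r \end{pmatrix}
&= \int\!\!\int \cdots \int_{\xi \le 0 \le \xi_1 < \xi_2 < \dots < \xi_{r-1}}
h_1\begin{pmatrix} n_2 - 1, n_3 - 1, \dots, n_r - 1 \\ \xi_1, \xi_2, \dots, \xi_{r-1} \end{pmatrix}
\sum_{\nu=1}^{r} (-1)^{\nu-1} f_1(x_\nu - \xi)\,
f_1\begin{pmatrix} x_1, \dots, x_{\nu-1}, x_{\nu+1}, \dots, x_r \\ \xi_1, \xi_2, \dots, \xi_{r-1} \end{pmatrix}
d\xi\, d\xi_1 \cdots d\xi_{r-1} \\
&= \int\!\!\int \cdots \int_{\xi \le 0 \le \xi_1 < \xi_2 < \dots < \xi_{r-1}}
h_1\begin{pmatrix} n_2 - 1, n_3 - 1, \dots, n_r - 1 \\ \xi_1, \xi_2, \dots, \xi_{r-1} \end{pmatrix}
f_1\begin{pmatrix} x_1, x_2, \dots, x_r \\ \xi, \xi_1, \dots, \xi_{r-1} \end{pmatrix}
d\xi\, d\xi_1 \cdots d\xi_{r-1}.
\end{aligned}
\]

But $h_1\begin{pmatrix} n_2 - 1, n_3 - 1, \dots, n_r - 1 \\ \xi_1, \xi_2, \dots, \xi_{r-1} \end{pmatrix} \ge 0$ by the inductive assumption, while
$f_1\begin{pmatrix} x_1, x_2, \dots, x_r \\ \xi, \xi_1, \dots, \xi_{r-1} \end{pmatrix} \ge 0$ since $\xi < \xi_1 < \xi_2 < \dots < \xi_{r-1}$, $x_1 < x_2 < \dots < x_r$,
and $f_1(x)$ is $PF_k$ by hypothesis. Hence
$h\begin{pmatrix} 1, n_2, n_3, \dots, n_r \\ x_1, x_2, x_3, \dots, x_r \end{pmatrix} \ge 0$.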

The remainder of the proof parallels the corresponding portion of the proof of 
Theorem 1; simply replace g by h. 

It is appropriate to compare Theorem 1 and Theorem 2. For this purpose we
sketch an argument which shows that Theorem 1 is actually a limiting case of
Theorem 2. A careful examination of the preceding argument reveals that in the
case of non-negative $PF_k$ random variables, the probability of first passage at
time $n$ into any positive interval, not only the interval $[x, \infty)$, is $TP_k$. In view
of this fact we shrink the interval to a point, and it readily follows that the first
passage time probability converges to the density corresponding to the $n$-fold
convolution. Since total positivity is preserved under this limiting operation,
Theorem 1 follows.

We now develop a series of consequences of Theorems 1 and 2. Let $F_i(x)$ be
the cumulative distribution function corresponding to $f_i(x)$, $i = 1, 2, \dots$. Then
as a direct corollary of Theorem 2, we have

THEOREM 3: Under the assumptions of Theorem 1,

\[
h(n, x) = F_1 * F_2 * \dots * F_{n-1}(x) - F_1 * F_2 * \dots * F_n(x)
\]

is $TP_k$, where $n$ ranges over $1, 2, \dots$ and $x > 0$. In particular, if $f_1 = f_2 = \dots = f$,
then $h(n, x) = F^{(n-1)}(x) - F^{(n)}(x)$ is $TP_k$.

PROOF: Simply note that

\[
h(n, x) = P\Bigl[\sum_{i=1}^{n} X_i \ge x;\ \sum_{i=1}^{n-1} X_i < x\Bigr]
\]

since the random variables are non-negative.
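
As a concrete instance of Theorem 3, take every f_i to be the unit exponential density (a choice made here for illustration only, not an example from the paper). Then F^{(n)} is the Erlang distribution function and h(n, x) reduces to the Poisson probability e^{-x} x^{n-1}/(n-1)!. The sketch below, with hypothetical helper names, computes h from its definition and evaluates second order determinants.

```python
from math import exp, factorial


def erlang_cdf(n, x):
    """F^(n)(x) for a sum of n unit exponentials; n = 0 is a unit mass at 0."""
    if n == 0:
        return 1.0
    return 1.0 - exp(-x) * sum(x ** j / factorial(j) for j in range(n))


def first_passage(n, x):
    """h(n, x) = F^(n-1)(x) - F^(n)(x): first passage above x at step n."""
    return erlang_cdf(n - 1, x) - erlang_cdf(n, x)


def second_order_minor(n1, n2, x1, x2):
    """Determinant of [h(n_i, x_j)] for n1 < n2 and x1 < x2."""
    return (first_passage(n1, x1) * first_passage(n2, x2)
            - first_passage(n1, x2) * first_passage(n2, x1))


grid = [0.5, 1.0, 2.0, 4.0]
minors = [second_order_minor(n1, n2, x1, x2)
          for n1 in range(1, 5) for n2 in range(n1 + 1, 6)
          for i, x1 in enumerate(grid) for x2 in grid[i + 1:]]
```

Every entry of `minors` should be non-negative up to rounding, in agreement with the $TP_2$ part of the theorem; higher orders could be examined in the same way with larger determinants.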

Actually we can say more about $h(n, x)$ in the situation where the $X_i$ are
non-negative, independent, and identically distributed random variables; Theorem 4
asserts that for each fixed $x > 0$, $h(n + m, x)$ is sign regular in the variables
$n \ge 0$ and $m \ge 0$.


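THEOREM 4: Suppose $f(x)$ is $PF_k$ with $f(x) = 0$ for $x < 0$. We define $h(n, x)$
by $h(n, x) = F^{(n-1)}(x) - F^{(n)}(x)$ for $n = 1, 2, \dots$; $x \ge 0$, and for fixed $x \ge 0$
we define

\[
c(n) =
\begin{cases}
h(n, x) & \text{for } n \ge 1, \\
0 & \text{for } n \le 0.
\end{cases}
\]

Then $c(n + m)$ is sign regular of order $k$ in $n \ge 1$ and $m \ge 1$; moreover, for
$1 \le n_1 < n_2 < \dots < n_r$, $1 \le m_1 < m_2 < \dots < m_r$, the sign of
$c_+\begin{pmatrix} n_1, n_2, \dots, n_r \\ m_1, m_2, \dots, m_r \end{pmatrix}$ is $(-1)^{r(r-1)/2}$, where

\[
c_+\begin{pmatrix} n_1, n_2, \dots, n_r \\ m_1, m_2, \dots, m_r \end{pmatrix}
= \begin{vmatrix}
c(n_1 + m_1) & \cdots & c(n_1 + m_r) \\
\vdots & & \vdots \\
c(n_r + m_1) & \cdots & c(n_r + m_r)
\end{vmatrix}.
\]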
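PROOF: For $m \ge 1$ and $n \ge 1$, we have

\[
c(n + m) = \int_0^x g(m, \xi) h(n, x - \xi)\, d\xi, \tag{12}
\]

where $g(m, \xi) = f^{(m)}(\xi)$. (12) simply states that if the partial sum first exceeds
$x$ at the $(n + m)$th stage, then this can occur by having the $m$th partial sum equal
to some non-negative $\xi < x$, while the partial sum starting with the $(m + 1)$st
variable first exceeds $x - \xi$ at the $n$th stage. From (12) and (2), we get, for
$1 \le n_1 < n_2 < \dots < n_r$, $1 \le m_1 < m_2 < \dots < m_r$, $r \le k$,

\[
c_+\begin{pmatrix} n_1, n_2, \dots, n_r \\ m_1, m_2, \dots, m_r \end{pmatrix}
= \int \cdots \int_{0 \le \xi_1 < \xi_2 < \dots < \xi_r < x}
g\begin{pmatrix} m_1, m_2, \dots, m_r \\ \xi_1, \xi_2, \dots, \xi_r \end{pmatrix}
h\begin{pmatrix} n_1, n_2, \dots, n_r \\ x - \xi_1, x - \xi_2, \dots, x - \xi_r \end{pmatrix}
d\xi_1\, d\xi_2 \cdots d\xi_r.
\]

By Theorem 1, $g\begin{pmatrix} m_1, m_2, \dots, m_r \\ \xi_1, \xi_2, \dots, \xi_r \end{pmatrix} \ge 0$. Since the
$x - \xi_1, x - \xi_2, \dots, x - \xi_r$ are in decreasing order of magnitude it follows,
invoking Theorem 3, that
$h\begin{pmatrix} n_1, n_2, \dots, n_r \\ x - \xi_1, x - \xi_2, \dots, x - \xi_r \end{pmatrix}$ has the sign $(-1)^{r(r-1)/2}$. Thus
$c_+\begin{pmatrix} n_1, n_2, \dots, n_r \\ m_1, m_2, \dots, m_r \end{pmatrix}$
has the sign $(-1)^{r(r-1)/2}$ as was to be proved.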
We prove below that $c(n)$ has the property of being $PF_2$ provided $f(x)$ is
$PF_2$ (Theorem 5). This is the property required for the analysis of the inventory
model of Section 8. In contrast, this relationship between $c(n)$ and $f(x)$ does not
persist beyond the second order.


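THEOREM 5: If $f(x)$ is $PF_2$ with $f(x) = 0$ for $x < 0$, then $c(n)$ (defined in
Theorem 4) is $PF_2$.

PROOF: Let $n_1 < n_2$, $m_1 < m_2$. Write

\[
c\begin{pmatrix} n_1, n_2 \\ m_1, m_2 \end{pmatrix}
= \begin{vmatrix}
c(n_1 - m_1) & c(n_1 - m_2) \\
c(n_2 - m_1) & c(n_2 - m_2)
\end{vmatrix}.
\]

(a) If $n_1 \le m_2$, then $c(n_1 - m_2) = 0$, so that
$c\begin{pmatrix} n_1, n_2 \\ m_1, m_2 \end{pmatrix} \ge 0$.

(b) If $n_1 > m_2$, we must have $m_1 < m_2 < n_1 < n_2$. Hence

\[
\begin{aligned}
c\begin{pmatrix} n_1, n_2 \\ m_1, m_2 \end{pmatrix}
&= \begin{vmatrix}
h(n_1 - m_1, x) & h(n_1 - m_2, x) \\
h(n_2 - m_1, x) & h(n_2 - m_2, x)
\end{vmatrix} \\
&= \int_0^x \begin{vmatrix}
h(n_1 - m_2, \xi) & h(n_1 - m_2, x) \\
h(n_2 - m_2, \xi) & h(n_2 - m_2, x)
\end{vmatrix}
g(m_2 - m_1, x - \xi)\, d\xi.
\end{aligned}
\]

Since $\xi < x$, $n_1 - m_2 < n_2 - m_2$, and $h(n, x)$ is $TP_2$ by Theorem 3, then
$c\begin{pmatrix} n_1, n_2 \\ m_1, m_2 \end{pmatrix} \ge 0$ and the proof is finished.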

5. Compound Distributions. As an easy corollary of Theorem 1, we have
corresponding determinantal properties for compound distributions composed
from $PT_k$ densities. Specifically:

THEOREM 6: Let $X_i \ge 0$ be distributed with density $f_i(x)$, a $PF_k$, $i = 1, 2, \dots$.
Define $S_N = \sum_{i=1}^{N} X_i$, where $N$ is a random variable independent of $X_1, X_2, \dots$,
with density $d(n, \mu)$, where $\mu$ is a parameter, and $d(n, \mu)$ is $PT_k$ in the variables
$n$ and $\mu$. Then $r(x, \mu)$, the probability density for $S_N$, is $PT_k$ in the variables $x > 0$
and $\mu$.

PROOF: $r(x, \mu) = \sum_{n=1}^{\infty} P[N = n]\, f_1 * f_2 * \dots * f_n(x) = \sum_{n=1}^{\infty} d(n, \mu)\, g(n, x)$.
By Theorem 1, $g(n, x)$ is $PT_k$. Applying Lemma 2 we conclude that
$r(x, \mu)$ is also $PT_k$.

In a similar fashion, we may study transforms of $g(n, x)$ in the variable $x$; the
proof is as in Theorem 6.

THEOREM 7: In addition to the hypothesis of Theorem 1 assume that $\varphi(x, s)$ is a
$PT_k$ function. Then $\phi(n, s) = \int g(n, x) \varphi(x, s)\, dx$ is $PT_k$.

As an illustration, let $\varphi(x, s) = e^{sx}$, $-\infty < s \le 0$, so that $\varphi(x, s)$ is $PT_\infty$.
From Theorem 7, we have that $\phi(n, s)$ is $PT_k$ in the variables $n$ and $s$. But $\phi(n, s)$
is the Laplace transform of the convolution of $n$ densities and so we have
$\phi(n, s) = \phi_1(s) \phi_2(s) \cdots \phi_n(s)$, where $\phi_i(s) = \int f_i(x) e^{sx}\, dx$, $i = 1, 2, \dots$. In
particular, we obtain the interesting set of inequalities:

\[
\begin{vmatrix}
\phi_1(s_1) & \phi_1(s_1)\phi_2(s_1) & \cdots & \phi_1(s_1)\phi_2(s_1) \cdots \phi_m(s_1) \\
\phi_1(s_2) & \phi_1(s_2)\phi_2(s_2) & \cdots & \phi_1(s_2)\phi_2(s_2) \cdots \phi_m(s_2) \\
\vdots & \vdots & & \vdots \\
\phi_1(s_m) & \phi_1(s_m)\phi_2(s_m) & \cdots & \phi_1(s_m)\phi_2(s_m) \cdots \phi_m(s_m)
\end{vmatrix} \ge 0
\]

where $s_1 < s_2 < \dots < s_m \le 0$, $m \le k$, $\phi_i(s) = \int f_i(x) e^{sx}\, dx$, and $f_i(x)$ is a $PF_k$
density with $f_i(x) = 0$ for $x < 0$, $i = 1, 2, \dots, k$.
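
The determinantal inequality above can be tried out on a case where the transforms are known in closed form. The following sketch assumes exponential densities f_i with rates λ_i (an illustrative choice, not taken from the paper), for which the transform of f_i at s ≤ 0 is λ_i/(λ_i - s). The helper names are hypothetical.

```python
def det(matrix):
    """Determinant by Laplace expansion along the first row (small matrices only)."""
    if len(matrix) == 1:
        return matrix[0][0]
    total = 0.0
    for col, entry in enumerate(matrix[0]):
        minor = [row[:col] + row[col + 1:] for row in matrix[1:]]
        total += (-1) ** col * entry * det(minor)
    return total


def transform_matrix(rates, s_values):
    """Row i holds the cumulative products of the transforms evaluated at s_values[i].

    Each factor is rate / (rate - s), the transform of an exponential density
    with the given rate; s_values must be increasing and non-positive.
    """
    rows = []
    for s in s_values:
        row, product = [], 1.0
        for rate in rates:
            product *= rate / (rate - s)
            row.append(product)
        rows.append(row)
    return rows


value = det(transform_matrix([1.0, 2.0, 3.0], [-2.0, -1.0, 0.0]))
```

With rates 1, 2, 3 and s = -2, -1, 0 the determinant works out by hand to 1/360, which is positive as the inequality requires.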


6. Convolution of Random Variables Ranging Over the Real Line. We have
seen on the basis of the example following Theorem 1, that the $n$-fold convolution
$g(n, x)$ of a $PF_k$ density whose possible values extend over the whole real line, is
not necessarily $PT_k$. Thus, in generalizing Theorem 1 to densities whose possible
values extend throughout the real line, it was necessary to formulate the problem
in terms of first passage probabilities rather than $n$-fold convolutions. However,
the question remains: what smoothening properties are possessed by the $n$-fold
convolution of a $PF_k$ density, which has possible values ranging over the full real
line? We can answer this query in terms of a weakened version of the variation
diminishing property possessed by totally positive functions. Recall that if
$p(x, w)$ is $TP_k$ and $q(w)$ changes sign $j \le k - 1$ times, then

\[
r(x) = \int p(x, w) q(w)\, dF(w)
\]

changes sign at most $j$ times; moreover, if $r(x)$ actually changes sign $j$ times, then
it must change sign in the same order as does $q(w)$ [9]. This variation diminishing
property may be compared with the following result.

THEOREM 8: Let $f(x)$ be a continuous $PF_k$, with $f(x)$ not necessarily 0 for $x < 0$.
Let $r_m(x) = \sum_{i=1}^{m} a_i g(n_i, x)$, where $n_1 < n_2 < \dots < n_m$, $m \le (k + 1)/2$, and
the $a_i$ are real non-zero constants. Then $r_m(x)$ has $\le 2(m - 1)$ sign changes.

PROOF: We proceed by induction. The theorem trivially holds for m = 1.

Assume the theorem holds for the case of a sum consisting of $m_0 - 1$ terms,
where $m_0 \le (k + 1)/2$. Write

$$
r_{m_0}(x) = \sum_{i=1}^{m_0} a_i g(n_i, x)
= \sum_{i=2}^{m_0} a_i \int g(n_i - n_1, \theta)\,g(n_1, x - \theta)\,d\theta
+ a_1 \lim_{R \to \infty} \int g_R(0, \theta)\,g(n_1, x - \theta)\,d\theta,
$$

where

$$
g_R(0, \theta) = \begin{cases} R & \text{for } 0 \le \theta \le 1/R \\ 0 & \text{otherwise.} \end{cases}
$$

Factoring, we get

$$
r_{m_0}(x) = \lim_{R \to \infty} \int \Big\{ \sum_{i=2}^{m_0} a_i g(n_i - n_1, \theta) + a_1 g_R(0, \theta) \Big\}\, g(n_1, x - \theta)\,d\theta. \tag{14}
$$

By the inductive hypothesis, $\sum_{i=2}^{m_0} a_i g(n_i - n_1, \theta)$ has at most $2(m_0 - 2)$ sign
changes as a function of $\theta$. With R sufficiently large, $a_1 g_R(0, \theta)$ can introduce at
most 2 additional sign changes. Thus for sufficiently large R,

$$ \sum_{i=2}^{m_0} a_i g(n_i - n_1, \theta) + a_1 g_R(0, \theta) $$

has at most $2(m_0 - 1)$ sign changes. Since $g(n_1, x - \theta)$ is a $PF_k$, and therefore
variation diminishing, we obtain that the integral of (14) possesses at most
$2(m_0 - 1)$ sign changes as a function of x. Taking the limit as $R \to \infty$, the number
of sign changes cannot increase, and thus the number of sign changes of $r_{m_0}(x)$ is
$\le 2(m_0 - 1)$.

Applying induction, we conclude that the theorem holds for m = 1, 2, ...,
(k + 1)/2 and the proof is finished.
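
The bound of Theorem 8 can be checked numerically on a simple case. The sketch below assumes, purely for illustration, that f is the standard normal density (a $PF_\infty$ function on the whole line), so that the n-fold convolution is the N(0, n) density; the coefficients, the orders and the grid are arbitrary choices, and the sign changes are counted on a finite grid only.

```python
import math

def normal_conv_density(n, x):
    """n-fold convolution of the standard normal density, i.e. the N(0, n) density."""
    return math.exp(-x * x / (2.0 * n)) / math.sqrt(2.0 * math.pi * n)

def count_sign_changes(values, tol=1e-300):
    """Number of sign changes along a sequence, ignoring (near-)zero entries."""
    signs = [1 if v > 0 else -1 for v in values if abs(v) > tol]
    return sum(1 for a, b in zip(signs, signs[1:]) if a != b)

def combination(coeffs, orders, x):
    """r_m(x) = sum_i a_i g(n_i, x)."""
    return sum(a * normal_conv_density(n, x) for a, n in zip(coeffs, orders))

grid = [i / 100.0 for i in range(-1000, 1001)]
two_terms = count_sign_changes([combination([1.0, -1.0], [1, 2], x) for x in grid])
three_terms = count_sign_changes([combination([1.0, -2.0, 1.0], [1, 2, 4], x) for x in grid])
print(two_terms, three_terms)   # 2 and 4, matching the bound 2(m - 1) for m = 2, 3
```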


7. Preserving Convexity and Concavity. Let $X_i \ge 0$ be independent random
variables distributed according to f(x), a $PF_k$. We now describe some further
smoothening properties possessed by the transformation which maps functions
into sequences, viz.

$$ h(n) = \int f^{(n)}(x)\,g(x)\,dx, \qquad n = 1, 2, \ldots. $$

We show first that the property of convexity is preserved under this transformation.
Explicitly, we prove that convexity in g(x) is carried over into convexity in h(n).
This will be demonstrated not only for the ordinary notion of
convexity, but for a type of convexity of higher order, which notion is made
precise below. Similar results hold for concavity.

Assume f(x) is $PF_3$ and g(x) is convex (of order 2). Let $\mu_i = \int x^i f(x)\,dx$,
i = 1, 2, ..., represent the moments of X. Note that for arbitrary real constants
$a_0$ and $a_1$,

$$ \int \{ g(x) - [(a_0/\mu_1)x + a_1] \}\, f^{(n)}(x)\,dx = h(n) - (a_0 n + a_1). $$

Since g(x) is convex, then $g(x) - [(a_0/\mu_1)x + a_1]$ has at most 2 changes of sign
and if 2 changes of sign actually occur, they occur in the order + − + as x
traverses the real axis from $-\infty$ to $+\infty$. Since f is $PF_3$, then by Theorem 1,
$f^{(n)}(x)$ is $PT_3$ in the variables n and x.

By the variation diminishing property of Pólya type functions, we infer that
$h(n) - (a_0 n + a_1)$ will have at most 2 changes of sign. Moreover, if
$h(n) - (a_0 n + a_1)$ has exactly 2 changes of sign, then these will occur in the
same order as those of $g(x) - [(a_0/\mu_1)x + a_1]$, namely + − +. Since $a_0$, $a_1$ are
arbitrary, we easily infer that h(n) is a convex function of n.

In a similar fashion we can show that higher order convexity is preserved under
this transformation as follows: A function g(x) is said to be convex of order r if
for an arbitrary polynomial $p(x) = a_0 x^{r-1} + a_1 x^{r-2} + \cdots + a_{r-1}$ of degree
r − 1, g(x) − p(x) has at most r changes of sign, and if r changes of sign actually
occur, they occur in the order + − + ···.

Assume that f(x) is $PF_{r+1}$ and g(x) is convex of order r. Note that
$\int x^j f^{(n)}(x)\,dx = E(X_1 + \cdots + X_n)^j = \mu_1^j n^j$ + lower powers of n. It follows immediately
that for an arbitrary polynomial $q(n) = a_0 n^{r-1} + a_1 n^{r-2} + \cdots + a_{r-1}$ of degree
r − 1, there exists a polynomial $p(x) = b_0 x^{r-1} + b_1 x^{r-2} + \cdots + b_{r-1}$ of degree
r − 1 such that $\int p(x) f^{(n)}(x)\,dx = q(n)$, and hence $\int [g(x) - p(x)] f^{(n)}(x)\,dx
= h(n) - q(n)$ with $a_0 > 0$. Since f(x) is $PF_{r+1}$, then by Theorem 1, $f^{(n)}(x)$
is $PT_{r+1}$ in the variables n and x and again by the variation diminishing property
of Pólya type functions, we obtain that h(n) − q(n) will have no more
changes of sign than g(x) − p(x). But g(x) − p(x) has at most r changes in
sign since g(x) is convex of order r; and so h(n) − q(n) has at most r changes of
sign. Moreover, if h(n) − q(n) actually has r changes of sign, then they will occur
in the same order as those of g(x) − p(x), namely + − + ···. Thus h(n)
is convex of order r since q(n) was an arbitrary polynomial of degree r − 1.

Similar results apply to concavity of higher order. A function g(x) is concave
of order r if for an arbitrary polynomial $p(x) = a_0 x^{r-1} + a_1 x^{r-2} + \cdots + a_{r-1}$ of
degree r − 1, g(x) − p(x) has at most r changes of sign, and if r changes of sign
happen then they occur in the order − + − ···.
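
The preservation of ordinary convexity can be illustrated numerically. The sketch below is only an example under stated assumptions: the summands are taken to be Bernoulli(p) variables (whose two-point density is a $PF_\infty$ sequence), so that $f^{(n)}$ is the binomial density, and g is an arbitrarily chosen convex function; the function names are hypothetical.

```python
from math import comb

def h(n, p, g):
    """h(n) = E g(X_1 + ... + X_n) for i.i.d. Bernoulli(p) summands."""
    q = 1.0 - p
    return sum(comb(n, k) * p**k * q**(n - k) * g(k) for k in range(n + 1))

def second_differences(values):
    """Second differences h(n+2) - 2 h(n+1) + h(n); nonnegative iff h is convex."""
    return [values[i + 2] - 2 * values[i + 1] + values[i] for i in range(len(values) - 2)]

def g(x):
    """A convex function of x, chosen only for illustration."""
    return abs(x - 3)

values = [h(n, 0.4, g) for n in range(1, 25)]
diffs = second_differences(values)
print(min(diffs) >= 0)   # convexity of g carries over to h on this range of n
```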

An application may be made to the inventory model discussed in [2], p. 227.
The probability density of demand for each period is $f(\xi)$, a $PF_3$. The policy
followed is to maintain the stock size at a fixed level S which will be suitably
chosen so as to minimize appropriate expected costs, or is determined by a fixed
capacity restriction. At the end of each period an order is placed to replenish
the stock consumed during that period so that a constant stock level is maintained
on the books. Delivery takes place a periods later. The expected cost for a
stationary period as a function of the lag is

$$ L(a) = \int_0^S h(S - x)\,f^{(a)}(x)\,dx + \int_S^\infty p(x - S)\,f^{(a)}(x)\,dx, $$

where S is fixed.
Assume now that h and p are convex increasing functions with h(0) =
p(0) = 0. Then we may write $L(a) = \int r(x)\,f^{(a)}(x)\,dx$, where

$$
r(x) = \begin{cases} h(S - x) & \text{for } 0 \le x \le S \\ p(x - S) & \text{for } S < x. \end{cases}
$$

Then r(x) is a convex function. Using the preceding results, we conclude that
L(a) is a convex function. Thus, if the length of lag should increase, the marginal
expected loss increases.

Similar results hold if p and h are concave. Also, if we assume f is $PF_{k+1}$ and
p and h are convex (concave) of order k, we may conclude that L(a) is convex
(concave) of order k.


8. Application to an Inventory Problem. We wish to determine the initial
spare parts kit for a system, which maximizes assurance of no shortage whatsoever
during a period of length t, under a budget for spares $c_0$. We consider only
essential components, and assume that a failed component is instantly replaced
by a spare, if available. Only spares initially provided may be used for replacement.
The system contains $d_i$ operating components of type i, i = 1, 2, ..., k.
The length of life of the jth operating component of the ith type is an independent
random variable with $PF_2$ density $f_{ij}$, j = 1, 2, ..., $d_i$. The unit cost of a
component of type i is $c_i$.

Our problem is to find $n_i$, the number of spares initially stocked of the ith
type, i = 1, 2, ..., k, such that $\prod_{i=1}^{k} P_i(n_i)$ is maximized subject to

$$ \sum_{i=1}^{k} n_i c_i \le c_0 \quad \text{and} \quad n_i = 0, 1, 2, \ldots \ \text{ for } i = 1, 2, \ldots, k, $$

where $P_i(m)$ = probability of experiencing $\le m$ failures of type i. (See [3], [15]
for a detailed discussion of this model and its application to reliability; our present
treatment is confined to aspects of the problem relevant to the present paper.)
In [3] and [15], methods are given for computing the solution when each
$\ln P_i(m)$ is concave in m, or equivalently, when each $P_i(n - m)$ is a $TP_2$ sequence
in n and m. To show $P_i(n - m)$ is a $TP_2$ sequence in n and m, we note:
1. $c_{ij}(n)$, the probability of requiring n replacements of operating component
i, j, is a $PF_2$ sequence in n for each fixed i, j by Theorem 5 above.
2. $p_i(n)$, the probability of requiring n replacements of type i, is a $PF_2$ sequence
in n for each i by Lemma 3, since $p_i(n) = c_{i1} * c_{i2} * \cdots * c_{id_i}(n)$.
3. $P_i(n - m)$ is a $TP_2$ sequence in n, m for each i, since

(a) $P_i(n) = \sum_m p_i(n - m)\,q(m)$, where
$$ q(m) = \begin{cases} 1 & \text{for } m = 0, 1, 2, \ldots \\ 0 & \text{otherwise,} \end{cases} $$

(b) q(m) is a $PF_2$ sequence.

(c) The convolution of $PF_2$ sequences is $PF_2$, by Lemma 3.

Thus when the underlying densities for the life of components are $PF_2$, the
methods given in [3] and [15] for obtaining optimal kits are applicable.
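
The equivalence between concavity of $\ln P_i(m)$ and the $TP_2$ property of $P_i(n - m)$ can be checked on a concrete case. The sketch below assumes, for illustration only, that the number of failures of one type is Poisson (as it would be for exponential lifetimes, which are $PF_2$); the mean 2.5 and the function names are hypothetical choices, not part of the model in [3] or [15].

```python
import math

def poisson_cdf(m, mean):
    """P(m): probability of at most m failures when the failure count is Poisson(mean)."""
    term, total = math.exp(-mean), 0.0
    for j in range(m + 1):
        total += term
        term *= mean / (j + 1)
    return total

def is_log_concave(seq, tol=1e-12):
    """True if the successive differences of ln(seq) are non-increasing."""
    logs = [math.log(v) for v in seq]
    return all(logs[i + 1] - logs[i] >= logs[i + 2] - logs[i + 1] - tol
               for i in range(len(logs) - 2))

def tp2_in_difference(P, size, tol=1e-12):
    """Check all 2x2 minors of the matrix P(n - m), with P(j) = 0 for j < 0."""
    def val(j):
        return P[j] if j >= 0 else 0.0
    for n1 in range(size):
        for n2 in range(n1 + 1, size):
            for m1 in range(size):
                for m2 in range(m1 + 1, size):
                    minor = val(n1 - m1) * val(n2 - m2) - val(n1 - m2) * val(n2 - m1)
                    if minor < -tol:
                        return False
    return True

P = [poisson_cdf(m, 2.5) for m in range(12)]
log_concave = is_log_concave(P)
tp2 = tp2_in_difference(P, 12)
print(log_concave, tp2)   # both properties hold together in this example
```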


9. Generating Totally Positive Functions. In this section we give a series of
examples of the above theorems. These theorems are written in terms of real
valued random variables but it should be emphasized that all our results are
equally valid for integer valued random variables. The underlying densities
are assumed to be the appropriate $PF_k$ sequences. The first few illustrations
involve integer valued random variables.

EXAMPLE 1:

(a) Let

$$
f(k) = \begin{cases} q & \text{for } k = 0 \\ p & \text{for } k = 1 \\ 0 & \text{for other } k, \end{cases}
\qquad \text{where } p + q = 1;
$$

then f(k) is a $PF_\infty$ sequence by direct verification. Alternately, we may appeal
to a classical result of Schoenberg and Edrei which asserts that a sequence is a
$PF_\infty$ sequence, if and only if its generating function is of the form

$$ e^{\gamma s} \prod_i \{(1 + \alpha_i s)/(1 - \beta_i s)\}, \quad \gamma \ge 0,\ \alpha_i \ge 0,\ \beta_i \ge 0;\ \sum \alpha_i \text{ and } \sum \beta_i \text{ convergent.} $$

(See p. 305, [6].) Applying Theorem 1, we obtain that the binomial
density $g(n, k) = f^{(n)}(k) = \binom{n}{k} p^k q^{n-k}$ is $PT_\infty$. It follows that $\binom{n}{k}$ is $TP_\infty$ in the
variables n and k.

A direct proof, in this case, is easy. For some of our further examples the result
is less apparent.
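
The total positivity of the binomial coefficients can be spot-checked directly. The following sketch tests every minor of order up to 3 of the matrix $\binom{n}{k}$ for $0 \le n, k \le 6$; the size of the matrix and the maximal order are arbitrary limits chosen to keep the example small, so this is a finite check rather than a proof.

```python
from itertools import combinations
from math import comb

def det(matrix):
    """Exact integer determinant by cofactor expansion along the first row."""
    if len(matrix) == 1:
        return matrix[0][0]
    return sum((-1) ** j * matrix[0][j] * det([row[:j] + row[j + 1:] for row in matrix[1:]])
               for j in range(len(matrix)))

def all_minors_nonnegative(kernel, rows, cols, max_order):
    """Check every minor of order 1..max_order built from increasing rows and columns."""
    for order in range(1, max_order + 1):
        for r in combinations(rows, order):
            for c in combinations(cols, order):
                if det([[kernel(n, k) for k in c] for n in r]) < 0:
                    return False
    return True

ok = all_minors_nonnegative(comb, range(7), range(7), 3)
print(ok)   # True: no negative minor of order <= 3 is found
```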
(b) Let

$$
f_i(k) = \begin{cases} 1/(1 + p^i) & \text{for } k = 0 \\ p^i/(1 + p^i) & \text{for } k = 1 \\ 0 & \text{for other } k, \end{cases}
$$

i = 1, 2, .... As pointed out in (a) above, each $\{f_i(k)\}_{k=0,1,\ldots}$ is a $PF_\infty$ sequence.
Hence, by Theorem 1, $g(n, k) = f_1 * f_2 * \cdots * f_n(k)$ is $PT_\infty$. But g(n, k) is
simply the coefficient of $s^k$ in the generating function $\prod_{i=1}^{n} [(1 + p^i s)/(1 + p^i)]$
of the n-fold convolution. Using the Gauss identity

$$ \prod_{i=1}^{n} (1 + p^i s) = \sum_{\nu=0}^{n} \begin{bmatrix} n \\ \nu \end{bmatrix} p^{\nu(\nu+1)/2} s^{\nu}, $$

where, by definition,

$$ \begin{bmatrix} n \\ \nu \end{bmatrix} = \frac{(1 - p^{n})(1 - p^{n-1}) \cdots (1 - p^{n-\nu+1})}{(1 - p)(1 - p^{2}) \cdots (1 - p^{\nu})} \quad \text{for } \nu \le n, $$

we find that the coefficient of $s^k$ in $\prod_{i=1}^{n} [(1 + p^i s)/(1 + p^i)]$ is
$\begin{bmatrix} n \\ k \end{bmatrix} p^{k(k+1)/2} \big/ \prod_{i=1}^{n} (1 + p^i)$. Since $p^{k(k+1)/2}$ is a function of k alone while
$\prod_{i=1}^{n} (1 + p^i)$ is a function of n alone, we conclude that $\begin{bmatrix} n \\ k \end{bmatrix}$ is $TP_\infty$. Note that
$\begin{bmatrix} n \\ k \end{bmatrix}$ is a type of generalization of the binomial coefficient $\binom{n}{k}$ since for
$p \to 1$, $\begin{bmatrix} n \\ k \end{bmatrix} \to \binom{n}{k}$.

(c) Let $f(k) = q^k p$, p + q = 1, k = 0, 1, 2, ...; f(k), the geometric density,
is the probability that the first success in a sequence of Bernoulli trials occurs
following k successive failures. The corresponding generating function is
p/(1 − qs). By [1], p. 305, f(k) is $PF_\infty$. Now

$$ g(n, k) = f^{(n)}(k) = \binom{n + k - 1}{k} p^n q^k, $$

so that g(n, k) represents the probability that the nth success occurs at trial
n + k in the sequence of Bernoulli trials. By Theorem 1, g(n, k) is $PT_\infty$. Since
$p^n$ is a function of n only, while $q^k$ is a function of k only, we obtain that
$\binom{n + k - 1}{k}$ is $TP_\infty$.

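(d) Next, let $f_i(k) = q^{ik}(1 - q^i)$, k = 0, 1, 2, ..., i = 1, 2, .... As noted
in (c), each $\{f_i(k)\}_{k=0,1,\ldots}$ is a $PF_\infty$ sequence. Hence

$$ g(n, k) = f_1 * f_2 * \cdots * f_n(k) $$

is $PT_\infty$ by Theorem 1. But g(n, k) is simply the coefficient of $s^k$ in the generating
function $\prod_{i=1}^{n} [(1 - q^i)/(1 - q^i s)]$. Using the Heine hypergeometric relation,
[6], p. 8,

$$ \prod_{i=1}^{n} \frac{1}{1 - q^i s} = \sum_{k=0}^{\infty} \frac{[n][n+1] \cdots [n+k-1]}{[1][2] \cdots [k]}\, q^k s^k, $$

where the symbol [m] is defined equal to $(1 - q^m)/(1 - q)$, we find that the
coefficient of $s^k$ in the generating function is

$$ \Big\{ \prod_{i=1}^{n} (1 - q^i) \Big\}\, q^k\, \frac{[n][n+1] \cdots [n+k-1]}{[1][2] \cdots [k]}. $$

Since $\prod_{i=1}^{n} (1 - q^i)$ is a function of n alone and $q^k$ is a function of k alone, we obtain that

$$ \frac{[n][n+1] \cdots [n+k-1]}{[1][2] \cdots [k]} $$

is $TP_\infty$.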
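
The two series identities used in (b) and (d) can be verified with exact rational arithmetic. The sketch below expands $\prod_{i=1}^{n}(1 + q^i s)$ and a truncation of $\prod_{i=1}^{n} 1/(1 - q^i s)$ and compares the coefficients with the Gaussian-coefficient expressions $\begin{bmatrix} n \\ k \end{bmatrix} q^{k(k+1)/2}$ and $\begin{bmatrix} n+k-1 \\ k \end{bmatrix} q^k$. A single base q is used for both identities, and the values q = 1/3, n = 5 and the truncation degree are arbitrary choices; all function names are hypothetical.

```python
from fractions import Fraction

def gauss_binomial(n, k, q):
    """Gaussian coefficient: prod_{j=1..k} (1 - q^(n-j+1)) / (1 - q^j)."""
    value = Fraction(1)
    for j in range(1, k + 1):
        value *= (1 - q**(n - j + 1)) / (1 - q**j)
    return value

def poly_mul(a, b, max_deg):
    """Product of two coefficient lists, truncated at degree max_deg."""
    out = [Fraction(0)] * (max_deg + 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            if i + j <= max_deg:
                out[i + j] += x * y
    return out

def product_series(n, q, max_deg, inverse):
    """Coefficients in s of prod (1 + q^i s), or of prod 1/(1 - q^i s), i = 1..n."""
    series = [Fraction(1)] + [Fraction(0)] * max_deg
    for i in range(1, n + 1):
        if inverse:   # geometric series 1/(1 - q^i s) = sum_k q^(ik) s^k
            factor = [q**(i * k) for k in range(max_deg + 1)]
        else:
            factor = [Fraction(1), q**i]
        series = poly_mul(series, factor, max_deg)
    return series

q, n, K = Fraction(1, 3), 5, 6
gauss_lhs = product_series(n, q, n, inverse=False)
gauss_rhs = [gauss_binomial(n, k, q) * q**(k * (k + 1) // 2) for k in range(n + 1)]
heine_lhs = product_series(n, q, K, inverse=True)
heine_rhs = [gauss_binomial(n + k - 1, k, q) * q**k for k in range(K + 1)]
print(gauss_lhs == gauss_rhs, heine_lhs == heine_rhs)   # True True
```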


Next we consider an example of the application of Theorem 1 to continuous
densities.
(e) Let

$$
f_i(x) = \begin{cases} (x - a_i)^{k_i - 1} e^{-(x - a_i)} / \Gamma(k_i) & \text{for } x \ge a_i \\ 0 & \text{for } x < a_i, \end{cases}
$$

where $k_i$ is a positive integer, $a_i \ge 0$, i = 1, 2, ...; thus $f_i(x)$ is a translated
gamma density. Then the characteristic function of $f_i(x)$ is

$$ \phi_i(t) = \int_{a_i}^{\infty} e^{itx} (x - a_i)^{k_i - 1} e^{-(x - a_i)} / \Gamma(k_i)\,dx = e^{i a_i t}/(1 - it)^{k_i}. $$

Defining $g(n, x) = f_1 * f_2 * \cdots * f_n(x)$, we have for its characteristic function
$\exp [it \sum_{i=1}^{n} a_i] / (1 - it)^{\sum_{i=1}^{n} k_i}$ and consequently

$$
g(n, x) = \begin{cases} (x - A_n)^{K_n - 1} e^{-(x - A_n)} / \Gamma(K_n) & \text{for } x \ge A_n \\ 0 & \text{for } x < A_n, \end{cases}
$$

where $A_n = \sum_{i=1}^{n} a_i$ and $K_n = \sum_{i=1}^{n} k_i$. This means that g(n, x) is also a translated
gamma with parameters corresponding to the sum of the individual
parameters.

Since each $f_i$ is $PF_\infty$, we may conclude that g(n, x) is $PT_\infty$ in the variables n
and x by Theorem 1, or equivalently, factoring out $e^{-x}$ and $e^{A_n}/\Gamma(K_n)$, that
$(x - A_n)^{K_n - 1}$ is $TP_\infty$. Note that by appropriate selection of the $a_i$ and the $k_i$
we may achieve for $A_n$ any increasing function of n and $K_n - 1$ may likewise
denote any strictly increasing integer-valued function of n.

THE POISSON APPROXIMATION TO THE POISSON BINOMIAL
DISTRIBUTION¹


By J. L. Hodges, Jr. and Lucien Le Cam
University of California, Berkeley

1. Introduction. It has been observed empirically that in many situations
the number S of events of a specified kind has approximately a Poisson distribution.
As examples we may mention the number of telephone calls, accidents,
suicides, bacteria, wars, Geiger counts, Supreme Court vacancies, and soldiers
killed by the kick of a horse.

Many textbooks in probability content themselves with an explanation of this
phenomenon that runs something like this: There is a large number, say n,
of events that might occur; for example, there are many telephone subscribers
who might place a call during a given minute. The chance, say p, that any
specified one of these events will occur (e.g., that a specified telephone subscriber
will call), is small. Assuming that the events are independent, S has
exactly the binomial distribution, say $\mathcal{B}(n, p)$. If we now let $n \to \infty$ and $p \to 0$,
so that $np \to \lambda$ where $\lambda$ is fixed and $0 < \lambda < \infty$, it is shown that $\mathcal{B}(n, p)$ tends
to the Poisson distribution $\mathcal{P}(\lambda)$ with expectation $\lambda$.

As was pointed out by von Mises [4], such an explanation is often not satisfactory
because the various trials cannot in many applications reasonably be
regarded as equally likely to succeed. Let $p_i$ denote the success probability of
the ith trial, i = 1, 2, ..., n. Then S has the distribution sometimes called
"Poisson binomial." Starting from this more realistic model von Mises shows
that S has in the limit the distribution $\mathcal{P}(\lambda)$, provided $n \to \infty$ and the $p_i$ vary
with n in such a way that $\sum p_i = \lambda$ is fixed and $\alpha = \max\{p_1, p_2, \ldots, p_n\}$ tends
to 0. This result is given in a few textbooks [1], [5].

The limit theorem of von Mises suggests that the Poisson approximation will
be reliable provided that n is large, $\alpha$ is small, and $\lambda$ is moderate. But even
these requirements are unnecessarily restrictive, as may be seen from a general
approximation theorem of Kolmogorov [2]. When this theorem is applied to our
problem, it asserts that there is some constant C, independent of n and the $p_i$,
such that the maximum absolute difference D between the cumulative distributions
of S and of $\mathcal{P}(\sum p_i)$ satisfies the bound $D \le C\alpha^{1/5}$. Thus, the Poisson
approximation will be good provided only $\alpha$ is small, whether n is small or large,
and whatever value $\sum p_i$ may have. It seems to us that this is the type of theorem
that best "explains" the empirical phenomenon of the "law of small numbers."

The purpose of our note is to present an elementary and relatively simple
proof of a bound of Kolmogorov's type. By using special features of the Poisson
distribution, we are able to get the improved bound $3\alpha^{1/3}$ for D, and to accomplish
this in a good deal simpler way than is required for the general result.
We believe that our proof is suitable for presentation to an introductory class
in probability theory.


2. The approximation theorems. Let $X_i$ indicate success on the ith trial,
so that $P(X_i = 1) = p_i$ and $P(X_i = 0) = 1 - p_i$. Our proofs will be based
on the device of introducing random variables $Y_i$ that have the Poisson distribution
with $E(Y_i) = p_i$, and are such that $P(X_i = Y_i)$ is as large as possible.
Specifically, we give to $X_i$ and $Y_i$ the joint distribution according to which

$$ P(X_i = Y_i = 1) = p_i e^{-p_i}, \qquad P(X_i = 1, Y_i = 0) = p_i(1 - e^{-p_i}), $$
$$ P(X_i = Y_i = 0) = e^{-p_i} - p_i(1 - e^{-p_i}), $$

and

$$ P(X_i = 0, Y_i = y) = p_i^{y} e^{-p_i}/y! \quad \text{for } y = 2, 3, \ldots. $$

We let the $Y_i$ be independent of each other. (The construction is valid if $p_i \le 0.8$,
insuring $P(X_i = Y_i = 0) \ge 0$. For $p_i > 0.8$ the results below are trivially
correct.)
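
The joint distribution just described can be tabulated and checked numerically. The sketch below builds it for a single trial with an arbitrarily chosen p = 0.3, truncating the Poisson tail at a hypothetical cut-off `y_max`, and confirms that X is Bernoulli(p), that Y has the Poisson probability $e^{-p}$ at 0, and that the mismatch probability equals $1 + p - (1 + 2p)e^{-p}$.

```python
import math

def coupling(p, y_max=60):
    """Joint pmf {(x, y): prob} of (X, Y) as constructed above, for 0 <= p <= 0.8."""
    e = math.exp(-p)
    joint = {(1, 1): p * e, (1, 0): p * (1 - e), (0, 0): e - p * (1 - e)}
    term = p * e                       # Poisson term p^y e^{-p} / y! at y = 1
    for y in range(2, y_max + 1):
        term *= p / y
        joint[(0, y)] = term
    return joint

def mismatch_probability(joint):
    """P(X != Y) under the joint distribution."""
    return sum(prob for (x, y), prob in joint.items() if x != y)

p = 0.3
joint = coupling(p)
total = sum(joint.values())
p_x1 = sum(prob for (x, _), prob in joint.items() if x == 1)
p_y0 = sum(prob for (_, y), prob in joint.items() if y == 0)
mismatch = mismatch_probability(joint)
print(total, p_x1, p_y0, mismatch, 2 * p * p)
```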

From the familiar additive property of Poisson variables, we know that
$T = \sum Y_i$ has exactly the Poisson distribution $\mathcal{P}(\sum p_i)$. Our objective is to show
that $S = \sum X_i$ has nearly this distribution. Specifically, if we let

$$ D = \sup_u |P(S \le u) - P(T \le u)| $$

denote the maximum absolute difference between the cumulatives of S and T,
we want to find conditions under which D is small.

THEOREM 1. $D \le 2\sum p_i^2$.

Using the inequality $e^{-p_i} \ge 1 - p_i$, it is easy to check that

$$ P(X_i \ne Y_i) = 1 + p_i - (1 + 2p_i)e^{-p_i} \le 2p_i^2. $$

Therefore, by Boole's inequality, $P(S \ne T) \le \sum P(X_i \ne Y_i) \le 2\sum p_i^2$. But
since $|P(S \le u) - P(T \le u)| \le P(S \ne T)$, the theorem follows.
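
Theorem 1 is easy to compare with the exact distance in small cases. The sketch below computes the Poisson binomial distribution by successive convolution, evaluates D at the integers against the Poisson distribution with the same mean, and prints it next to $2\sum p_i^2$. The success probabilities are made-up values and the function names are hypothetical.

```python
import math

def poisson_binomial_pmf(probs):
    """Exact pmf of S = X_1 + ... + X_n by successive convolution."""
    pmf = [1.0]
    for p in probs:
        nxt = [0.0] * (len(pmf) + 1)
        for k, mass in enumerate(pmf):
            nxt[k] += mass * (1.0 - p)
            nxt[k + 1] += mass * p
        pmf = nxt
    return pmf

def poisson_pmf(mean, k_max):
    """Poisson probabilities for k = 0..k_max."""
    pmf, term = [], math.exp(-mean)
    for k in range(k_max + 1):
        pmf.append(term)
        term *= mean / (k + 1)
    return pmf

def max_cdf_distance(probs):
    """D = sup_u |P(S <= u) - P(T <= u)| with T Poisson(sum of probs)."""
    s_pmf = poisson_binomial_pmf(probs)
    t_pmf = poisson_pmf(sum(probs), len(probs) + 60)
    s_cdf = t_cdf = best = 0.0
    for k, t_mass in enumerate(t_pmf):
        s_cdf += s_pmf[k] if k < len(s_pmf) else 0.0
        t_cdf += t_mass
        best = max(best, abs(s_cdf - t_cdf))
    return best

probs = [0.05, 0.1, 0.02, 0.2, 0.07, 0.01]   # hypothetical success probabilities
D = max_cdf_distance(probs)
bound = 2 * sum(p * p for p in probs)
print(D, bound)
```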

In order to prove our next theorem, we shall need a uniform bound on the individual
terms $p(k, \lambda) = e^{-\lambda}\lambda^k/k!$ of the Poisson distribution. It is well known
that for large $\lambda$, the maximum term is of the order $\lambda^{-1/2}$, but we will give a specific
upper bound.

LEMMA. The maximum term of the distribution $\mathcal{P}(\lambda)$ is less than
$(1 + 1/(12\lambda))/(2\pi\lambda)^{1/2}$.

PROOF. Suppose $k \le \lambda < k + 1$. The maximum term is then $e^{-\lambda}\lambda^k/k!$, as
may be seen by looking at the ratio of successive terms. Since $\lambda^{k+1/2}e^{-\lambda}$ is maximized
at $\lambda = k + \tfrac{1}{2}$, and since $1 + 1/(12\lambda) > 1 + 1/(12(k + 1))$, it will suffice
to show that

$$ (2\pi)^{1/2}(k + \tfrac{1}{2})^{k + 1/2} e^{-(k + 1/2)} < k!\,[1 + 1/(12(k + 1))] $$

for k = 0, 1, 2, .... This inequality may easily be checked by direct computation
for k = 0, 1, and 2, and for $k \ge 3$ by using the Stirling bound
$k! > (2\pi)^{1/2} k^{k + 1/2} e^{-k + 1/(12k + 1)}$.
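
The lemma can be checked numerically over a range of means. The sketch below evaluates the largest Poisson term, taken at $k = \lfloor \lambda \rfloor$, and divides it by the stated bound $(1 + 1/(12\lambda))/(2\pi\lambda)^{1/2}$; the grid of $\lambda$ values from 0.05 to 100 is an arbitrary choice, so this is a spot check rather than a proof.

```python
import math

def max_poisson_term(lam):
    """Largest term e^{-lam} lam^k / k!, attained at k = floor(lam)."""
    k = math.floor(lam)
    return math.exp(-lam + k * math.log(lam) - math.lgamma(k + 1))

def lemma_bound(lam):
    """The bound (1 + 1/(12 lam)) / sqrt(2 pi lam) of the lemma."""
    return (1.0 + 1.0 / (12.0 * lam)) / math.sqrt(2.0 * math.pi * lam)

lams = [0.05 * j for j in range(1, 2001)]       # lam from 0.05 to 100
worst_ratio = max(max_poisson_term(lam) / lemma_bound(lam) for lam in lams)
print(worst_ratio)   # stays below 1 on this grid
```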


Let us denote $\sum p_i$ by $\lambda$ and $\sum p_i^2$ by $\mu$.
THEOREM 2. $D \le (3\mu/a^2) + (a + 1)(1 + 1/(12\lambda))/(2\pi\lambda)^{1/2}$.
To prove this, we shall consider the random variables $Z_i = Y_i - X_i$. Here

$$ E(Z_i) = 0, $$

while

$$ \mathrm{Var}(Z_i) = E(Z_i^2) = p_i(1 - e^{-p_i}) + \sum_{k=2}^{\infty} k^2 p_i^k e^{-p_i}/k! $$
$$ = p_i(1 - e^{-p_i}) + E(Y_i^2) - p_i e^{-p_i} = p_i^2 + 2p_i(1 - e^{-p_i}) \le 3p_i^2. $$

Let $\sum Z_i = U$. Then E(U) = 0 and $\mathrm{Var}(U) \le 3\mu$.

Let a be any positive number. If $T = S + U \le v - a$, then either $S \le v$
or $U \le -a$, so that $P(T \le v - a) \le P(S \le v) + P(U \le -a)$ and

$$ P(T \le v) - P(S \le v) \le P(v - a \le T \le v) + P(U \le -a). $$

Similarly, if $S = T - U \le v$, then either $T \le v + a$ or $U \ge a$, so that

$$ P(S \le v) \le P(T \le v + a) + P(U \ge a) $$

and $P(S \le v) - P(T \le v) \le P(v \le T \le v + a) + P(U \ge a)$. Combining,
we see that

$$ D = \sup_v |P(S \le v) - P(T \le v)| \le \sup_v P(v \le T \le v + a) + P(|U| \ge a). $$

By the Chebycheff inequality, $P(|U| \ge a) \le \mathrm{Var}(U)/a^2 \le 3\mu/a^2$. Using
the lemma, we see that

$$ \sup_v P(v \le T \le v + a) \le (a + 1)(1 + 1/(12\lambda))/(2\pi\lambda)^{1/2}, $$

since there are at most a + 1 Poisson terms in the interval from v to v + a.
This completes the proof.

We now combine Theorems 1 and 2 to obtain our main result.

THEOREM 3. $D \le 3\alpha^{1/3}$.

We prove this by considering two cases. If $2\mu \le 3\alpha^{1/3}$, the theorem is an immediate
consequence of Theorem 1. On the other hand, if $2\mu > 3\alpha^{1/3}$, we have
by virtue of $\mu \le \alpha\lambda$ the inequality $\lambda > 3/(2\alpha^{2/3}) > 1$. Now suppose that $a \ge 1$.
Then

$$ (a + 1)(1 + 1/(12\lambda))/(2\pi\lambda)^{1/2} < a/\lambda^{1/2} $$

and

$$ D \le (3\mu/a^2) + (a/\lambda^{1/2}). $$

This is minimized when $a = (6\mu\lambda^{1/2})^{1/3} = a_0$. Since $\lambda \ge \mu/\alpha$ and $\mu > 3\alpha^{1/3}/2$,
we see that $\mu\lambda^{1/2} > (3/2)^{3/2}$ or $a_0 > (3^5/2)^{1/6} > 1$, so the restriction $a \ge 1$
is satisfied. Substituting $a = a_0$ gives $D \le \tfrac{3}{2}(6\mu/\lambda)^{1/3} \le \tfrac{3}{2}(6\alpha)^{1/3} < 3\alpha^{1/3}$,
and the theorem is proved.


3. Remarks.

(i) We have presented our results as approximation theorems rather than as
limit theorems. We believe it is better pedagogy to do so, since in the applications
there will be definite values of n and the $p_i$, which are not "tending" to
anything. However, if limit theorems are desired they follow at once. For example,
Theorem 1 implies that $D \to 0$ as $\mu = \sum p_i^2 \to 0$, whereas Theorem 3
implies that $D \to 0$ as $\alpha = \max\{p_1, \ldots, p_n\} \to 0$.

(ii) Our Theorem 1 gives a simple and elementary proof of the standard
textbook result that $\mathcal{B}(n, p) \to \mathcal{P}(\lambda)$ as $n \to \infty$, $p \to 0$, and $np \to \lambda$, since
under these conditions $\sum p_i^2 = \lambda p \to 0$. Furthermore, Theorem 1 implies the
more realistic theorem of von Mises, since if $\sum p_i = \lambda$ is fixed while $\alpha \to 0$, we
must have $\sum p_i^2 \le \alpha \sum p_i = \alpha\lambda \to 0$.

(iii) As is customarily the case with bounds for the accuracy of approximations,
our bound has only theoretical interest, being much too crude for practical
usefulness. By pushing the method of proof, the constant factor 3 in the
inequality $D \le 3\alpha^{1/3}$ can be reduced, but the result would still be of only
theoretical value. It can be shown [3], using a much less simple argument, that
$D \le 9\alpha$. While it is clearly a theoretical improvement to have a bound of order $\alpha$
rather than one of order $\alpha^{1/3}$, even the bound $9\alpha$ is of limited applicational
use. Fortunately, approximations are usually found in practice to be much
better than the known bounds would indicate them to be.

(iv) The condition that $\alpha \to 0$ is sufficient but not necessary for $D \to 0$.
It is easy to see that S will have approximately a Poisson distribution even
if a few of the $p_i$ are quite large, provided these values contribute only a small
part of the total $\sum p_i$.

CONFIDENCE BOUNDS CONNECTED WITH ANOVA AND MANOVA FOR
BALANCED AND PARTIALLY BALANCED INCOMPLETE
BLOCK DESIGNS¹


By V. P. Bhapkar²
University of North Carolina

1. Introduction and summary. It is well known [3, 4] how, in the case of any 
general strongly testable [5] linear hypothesis for either ANOVA or MANOVA 
one can put simultaneous confidence bounds on a particular set of parametric 
functions, which might be regarded as measures of deviation from the “total” 
hypothesis and its various components. The parametric functions are such that, 
in each problem, one of these can be appropriately called the “total” and the 
rest “partials” of various orders. For each problem the “total” function, (i) in 
the univariate case, is related to, but not quite the same as, the noncentrality 
parameter of the usual F-test of the “total” hypothesis in ANOVA, and (ii) 
in the multivariate case, is the largest characteristic root of a certain parametric 
matrix which is related to, but not quite the same as, another parametric matrix 
whose nonzero characteristic roots occur as a set of noncentrality parameters 
in the power function for the test (no matter which of the standard tests we use) 
of the “total” hypothesis in MANOVA. The same remark applies to “partials”
of various orders considered in the proper sense.

In this note, for both ANOVA and MANOVA, the hypothesis considered is 
that of equality of the treatment effects—vector equality in the case of MANOVA. 
Starting from such a hypothesis, explicit algebraic expressions are obtained 
for the total and partial parametric functions that go with the simultaneous 
confidence statements in the case of both ANOVA and MANOVA and for 
balanced and partially balanced designs. It is also indicated how to obtain, in a 
convenient form, the algebraic expression for the confidence bounds on each 
such parametric function, without a derivation of these expressions in an ex- 
plicit form. 


2. Notation and preliminaries.
(i) Univariate case. Let x denote a column-vector of n independent normal
variables with a common variance $\sigma^2$ and the means given by

$$ E(x)_{n \times 1} = A_{n \times m}\,\theta_{m \times 1}, \tag{1} $$

where A is a matrix of known constants and $\theta$ is a vector of unknown parameters.


1 This research was supported partly by the Office of Naval Research under Contract
No. Nonr-855 (06) for research in probability and statistics at Chapel Hill and partly by
the United States Air Force through the Air Force Office of Scientific Research of the
Air Research and Development Command, under Contract No. AF 49 (638)-213. Reproduction
in whole or in part is permitted for any purpose of the United States Government.

The hypothesis

$$ \mathcal{H}_0 : B_{s \times m}\,\theta = 0 \quad [\text{Rank } B = s] \tag{2} $$

is said to be strongly testable if Rank (A', B') is equal to Rank (A). If we
write

$$ B\theta = \phi_{s \times 1}, \tag{3} $$

then the 'total' parametric function, $\Delta$, associated with $\mathcal{H}_0$ is

$$ \Delta = \phi' D^{-1} \phi, \tag{4} $$

where $D\sigma^2$ is the variance-covariance matrix of the best unbiased linear estimates
of $\phi$. It may be observed that $\Delta/\sigma^2$ is the noncentrality parameter of the F-test
for $\mathcal{H}_0$. Confidence bounds on $\Delta$, with a confidence coefficient greater than or
equal to $(1 - \alpha)$, are then [3, 4] given by

$$ S_{\mathcal{H}_0}^{1/2} - \Big[\frac{s}{n - r} F_\alpha\Big]^{1/2} S_E^{1/2} \le \Delta^{1/2} \le S_{\mathcal{H}_0}^{1/2} + \Big[\frac{s}{n - r} F_\alpha\Big]^{1/2} S_E^{1/2}, \tag{5} $$

where r = Rank A, $F_\alpha$ is the $100\alpha\%$ significance point of F with d.f. s and n − r
respectively, $S_{\mathcal{H}_0}$ is the sum of squares due to $\mathcal{H}_0$, and $S_E$ is the sum of squares
due to error. We also have the simultaneous confidence statements

$$ S_{\mathcal{H}_{0(t)}}^{1/2} - \Big[\frac{s}{n - r} F_\alpha\Big]^{1/2} S_E^{1/2} \le \Delta_{(t)}^{1/2} \le S_{\mathcal{H}_{0(t)}}^{1/2} + \Big[\frac{s}{n - r} F_\alpha\Big]^{1/2} S_E^{1/2}, \tag{6} $$

where $\Delta_{(t)} = \phi_{(t)}' D_{(t)}^{-1} \phi_{(t)}$, $\phi_{(t)}$ is any subvector of $\phi$, $D_{(t)}$ is the corresponding
submatrix of D and $S_{\mathcal{H}_{0(t)}}$ is the corresponding sum of squares due to the partial
hypothesis $\mathcal{H}_{0(t)}$: $\phi_{(t)} = 0$. (5) and (6) are implications of (13.2.21) on p. 90
in [3].
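
For orientation, bounds of the form $S_{H}^{1/2} \mp [\,s F_\alpha/(n - r)\,]^{1/2} S_E^{1/2}$ on $\Delta^{1/2}$ can be computed as in the sketch below. The function name and the numbers are hypothetical; the significance point $F_\alpha$ has to be supplied by the caller (for instance from tables), and truncating the lower bound at zero is a choice made here, since $\Delta^{1/2}$ cannot be negative.

```python
import math

def total_confidence_bounds(ss_hypothesis, ss_error, s, error_df, f_alpha):
    """Lower and upper bounds on Delta^{1/2}; error_df stands for n - r."""
    half_width = math.sqrt(s / error_df * f_alpha * ss_error)
    center = math.sqrt(ss_hypothesis)
    lower = max(0.0, center - half_width)   # Delta^{1/2} is nonnegative
    return lower, center + half_width

# Made-up inputs: S_H = 9, S_E = 1, s = 2, n - r = 8, F_alpha = 4 (not a table value).
lower, upper = total_confidence_bounds(
    ss_hypothesis=9.0, ss_error=1.0, s=2, error_df=8, f_alpha=4.0)
print(lower, upper)   # 2.0 4.0
```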

In the case of treatment-block designs, we have

$$ E(x_\alpha) = t_i + b_j, \qquad i = 1, 2, \ldots, v, \quad j = 1, 2, \ldots, b, \tag{7} $$

if the $\alpha$th observation belongs to the ith treatment and jth block. The hypothesis
of equality of treatment effects may be expressed as

$$ \mathcal{H}_0 : (I_{v-1},\ -J_{v-1,1})\,t = 0, \tag{8} $$

where $t' = (t_1, t_2, \ldots, t_v)$ and $J_{mn} = \{1\}_{m \times n}$. We shall write $J_{mm}$ as $J_m$. We
assume that the design is connected. Let $n_{ij} = 1\,(0)$ if the ith treatment appears
(does not appear) in the jth block. Then $N = (n_{ij})_{v \times b}$ is the incidence-matrix of
the design. Let r, k, T and B denote the number of replications of each treatment,
the number of observations in each block, the vector of treatment totals and the
vector of block-totals respectively. Then it is well-known [2] that the equations
for t are

$$ C\hat{t} = Q, \tag{9} $$

where $C = rI - (1/k)NN'$ and $Q = T - (1/k)NB$. Also

$$ \mathrm{Cov}(Q) = \sigma^2 C. \tag{10} $$

Then, from (3) and (8), $\phi_i = t_i - t_v$, i = 1, 2, ..., v − 1. We may express $\Delta$ in
a symmetrical form by taking $\phi = (I_{v-1},\ -J_{v-1,1})\xi$, where

$$ \xi_i = t_i - (1/v)(t_1 + \cdots + t_v), \qquad i = 1, 2, \ldots, v. $$

From (4)

$$ \Delta = \xi'(I_{v-1},\ -J_{v-1,1})' D^{-1} (I_{v-1},\ -J_{v-1,1})\xi. \tag{11} $$

(ii) Multivariate case. Let X denote a matrix of n independent p-dimensional
normal variables with a common variance-covariance matrix $\Sigma$, p being the
number of characters observed on each individual, and let the means be given by

$$ E(X)_{n \times p} = A_{n \times m}\,\Theta_{m \times p}, \tag{12} $$

where $\Theta$ is a matrix of unknown parameters. Suppose that

$$ \mathcal{H}_0 : B\,\Theta\,U_{p \times u} = 0 \quad [\text{Rank } U = u \le p] \tag{13} $$

is the "strongly testable" hypothesis to be tested. If we write

$$ B\,\Theta\,U = \Phi_{s \times u}, \tag{14} $$

then the "total" parametric function, $\Delta$, associated with $\mathcal{H}_0$ is [3, 4] given by

$$ \Delta = \mathrm{Ch}_{\max}[\Phi' D^{-1} \Phi]. \tag{15} $$

It may be observed that the characteristic roots of $\Phi' D^{-1} \Phi\,(U'\Sigma U)^{-1}$ are the
noncentrality parameters in the power function of the test (no matter which of
the standard tests we use) of the "total" hypothesis given by (13).

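The confidence statement is [3] given by

$$
\mathrm{Ch}_{\max}^{1/2}(S_{\mathcal{H}_0}) - \Big[\frac{s}{n - r} c_\alpha\Big]^{1/2} \mathrm{Ch}_{\max}^{1/2}(S_E) \le \Delta^{1/2}
\le \mathrm{Ch}_{\max}^{1/2}(S_{\mathcal{H}_0}) + \Big[\frac{s}{n - r} c_\alpha\Big]^{1/2} \mathrm{Ch}_{\max}^{1/2}(S_E), \tag{16}
$$

where $S_{\mathcal{H}_0}$ and $S_E$ are the sum of products matrices due to the hypothesis and
error respectively, and $c_\alpha$ is the $100\alpha\%$ significance point of the distribution of
the largest characteristic root, with d.f. u, s, and n − r. In this case, we have
simultaneous confidence statements, similar to (6), given by

$$
\mathrm{Ch}_{\max}^{1/2}[S_{\mathcal{H}_{0(t)}}] - \Big[\frac{s}{n - r} c_\alpha\Big]^{1/2} \mathrm{Ch}_{\max}^{1/2}(S_E) \le \Delta_{(t)}^{1/2}
\le \mathrm{Ch}_{\max}^{1/2}[S_{\mathcal{H}_{0(t)}}] + \Big[\frac{s}{n - r} c_\alpha\Big]^{1/2} \mathrm{Ch}_{\max}^{1/2}(S_E), \tag{17}
$$

where $\Delta_{(t)} = \mathrm{Ch}_{\max}[\Phi_{(t)}' D_{(t)}^{-1} \Phi_{(t)}]$, $\Phi_{(t)}$ being a submatrix of $\Phi$ obtained by choosing
some rows of $\Phi$. In addition, we have, by dropping some columns of $\Phi$, simultaneous
confidence statements given by

$$
\mathrm{Ch}_{\max}^{1/2}[S_{\mathcal{H}_0}^{(w)}] - \Big[\frac{s}{n - r} c_\alpha\Big]^{1/2} \mathrm{Ch}_{\max}^{1/2}[S_E^{(w)}] \le \{\Delta^{(w)}\}^{1/2}
\le \mathrm{Ch}_{\max}^{1/2}[S_{\mathcal{H}_0}^{(w)}] + \Big[\frac{s}{n - r} c_\alpha\Big]^{1/2} \mathrm{Ch}_{\max}^{1/2}[S_E^{(w)}], \tag{18}
$$

where $\Delta^{(w)} = \mathrm{Ch}_{\max}[\Phi^{(w)\prime} D^{-1} \Phi^{(w)}]$, $\Phi^{(w)}$ being a submatrix of $\Phi$ obtained by choosing
some columns of $\Phi$, and $S_{\mathcal{H}_0}^{(w)}$ and $S_E^{(w)}$ are the corresponding submatrices of
$S_{\mathcal{H}_0}$ and $S_E$. (16), (17) and (18) are implications of (14.6.3) on p. 101 in [3].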
In the case of treatment-block designs, we have

$$ E(x_\alpha^{(k)}) = t_i^{(k)} + b_j^{(k)}, \qquad i = 1, 2, \ldots, v, \quad j = 1, 2, \ldots, b, \quad k = 1, 2, \ldots, p, \tag{19} $$

where $x_\alpha^{(k)}$ denotes the kth character measured on the $\alpha$th experimental unit or
individual that turns up for the ith treatment and the jth block; and $t_i^{(k)}$, $b_j^{(k)}$
stand respectively for the contributions to the expectation of the kth variate
made by the ith treatment and the jth block.

From (1) and (12) we have the same "structure matrix", A, in the multivariate
situation as in the univariate case. This "structure matrix" depends on the design
as well as on what the experimental statisticians have called the model, e.g., (7)
and (19).

In this set-up, so far as the hypothesis (13) is concerned, we shall take U = I
for simplicity.


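3. Balanced incomplete block designs.
(i) Univariate case. Here

$$ C = rI_v - \frac{1}{k}[(r - \lambda)I_v + \lambda J_v] = \frac{\lambda v}{k} I_v - \frac{\lambda}{k} J_v. $$

Imposing the usual condition, $J_{1v}\hat{t} = 0$, to get unique solutions, we have
$\hat{t} = (k/\lambda v)Q$. Therefore, $\hat{\phi} = (k/\lambda v)(I_{v-1},\ -J_{v-1,1})Q$, and hence

$$ D = \frac{k^2}{\lambda^2 v^2}(I_{v-1},\ -J_{v-1,1})\,C\,(I_{v-1},\ -J_{v-1,1})' = \frac{k}{\lambda v}(I_{v-1} + J_{v-1}), \tag{20} $$

whence

$$ D^{-1} = \frac{\lambda v}{k}\big(I_{v-1} - (1/v)J_{v-1}\big). \tag{21} $$

Thus, using (11) and the relation $J_{1v}\xi = 0$,

$$
\Delta = \frac{\lambda v}{k}\,\xi'(I_{v-1},\ -J_{v-1,1})'\big(I_{v-1} - (1/v)J_{v-1}\big)(I_{v-1},\ -J_{v-1,1})\xi
= \frac{\lambda v}{k}\,\xi'\big(I_v - (1/v)J_v\big)\xi
= \frac{\lambda v}{k}\,\xi'\xi
= \frac{\lambda v}{k}\sum_{i=1}^{v} \xi_i^2. \tag{22}
$$

Then we can have a confidence statement of the form of (5) with n = bk,
r = b + v − 1, s = v − 1 and $\Delta$ given by (22).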
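
The algebra for the balanced case can be confirmed on a small design with exact arithmetic. The sketch below uses the Fano plane (v = b = 7, r = k = 3, $\lambda$ = 1) purely as an example of a BIBD; it checks that $D = (k/\lambda v)(I_{v-1} + J_{v-1})$, that $(\lambda v/k)(I_{v-1} - (1/v)J_{v-1})$ is its inverse, and that $\phi' D^{-1} \phi = (\lambda v/k)\sum \xi_i^2$ for an arbitrary vector t of treatment effects. The helper names and the vector t are hypothetical.

```python
from fractions import Fraction

def matmul(a, b):
    """Matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def transpose(a):
    return [list(col) for col in zip(*a)]

# Fano plane: v = b = 7, r = k = 3, lambda = 1 (used only as an example BIBD).
blocks = [(0, 1, 2), (0, 3, 4), (0, 5, 6), (1, 3, 5), (1, 4, 6), (2, 3, 6), (2, 4, 5)]
v, r, k, lam = 7, 3, 3, 1
N = [[Fraction(1 if i in blk else 0) for blk in blocks] for i in range(v)]
NNt = matmul(N, transpose(N))
C = [[(r if i == j else 0) - NNt[i][j] / k for j in range(v)] for i in range(v)]

# Contrast matrix (I_{v-1}, -J_{v-1,1}).
P = [[Fraction(1 if j == i else (-1 if j == v - 1 else 0)) for j in range(v)]
     for i in range(v - 1)]
scale = Fraction(k, lam * v)
PCPt = matmul(matmul(P, C), transpose(P))
D = [[scale * scale * x for x in row] for row in PCPt]
D_expected = [[scale * (2 if i == j else 1) for j in range(v - 1)] for i in range(v - 1)]
D_inv = [[(1 / scale) * ((1 if i == j else 0) - Fraction(1, v)) for j in range(v - 1)]
         for i in range(v - 1)]
identity = [[Fraction(1 if i == j else 0) for j in range(v - 1)] for i in range(v - 1)]

# Quadratic form for an arbitrary treatment vector t (made-up values).
t = [Fraction(x) for x in (3, 1, 4, 1, 5, 9, 2)]
mean_t = sum(t) / v
xi = [x - mean_t for x in t]
phi = [[t[i] - t[v - 1]] for i in range(v - 1)]
delta = matmul(matmul(transpose(phi), D_inv), phi)[0][0]
delta_symmetric = Fraction(lam * v, k) * sum(x * x for x in xi)
print(D == D_expected, matmul(D, D_inv) == identity, delta == delta_symmetric)
```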

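For the "partial" statements (6), if $\phi_{(t)}' = (\phi_{i_1}, \phi_{i_2}, \ldots, \phi_{i_t})$, $(t \le v - 1)$
then, from (20),

$$ D_{(t)} = \frac{k}{\lambda v}(I_t + J_t). \tag{23} $$

Hence

$$ D_{(t)}^{-1} = \frac{\lambda v}{k}\Big(I_t - \frac{1}{t + 1} J_t\Big) \tag{24} $$

and

$$ \Delta_{(t)} = \frac{\lambda v}{k}\,\phi_{(t)}'\Big(I_t - \frac{1}{t + 1} J_t\Big)\phi_{(t)}. $$

For a symmetrical expression, we take $\phi_{(t)} = (I_t,\ -J_{t1})\xi_{(t)}$, where

$$ \xi_{i_j(t)} = t_{i_j} - (1/(t + 1))[t_{i_1} + t_{i_2} + \cdots + t_{i_t} + t_v] $$

so that, using $J_{1,t+1}\xi_{(t)} = 0$,

$$
\Delta_{(t)} = \frac{\lambda v}{k}\,\xi_{(t)}'(I_t,\ -J_{t1})'\Big(I_t - \frac{1}{t + 1} J_t\Big)(I_t,\ -J_{t1})\xi_{(t)}
= \frac{\lambda v}{k}\,\xi_{(t)}'\xi_{(t)}
= \frac{\lambda v}{k}\Big[\sum_{j=1}^{t} \xi_{i_j(t)}^2 + \xi_{v(t)}^2\Big]. \tag{25}
$$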

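(ii) Multivariate case. We have the confidence bounds of the form of (16) with
n = bk, r = b + v − 1, s = v − 1 and

$$ \Delta = \mathrm{Ch}_{\max}\Big[\frac{\lambda v}{k}\,\Phi'\big(I_{v-1} - (1/v)J_{v-1}\big)\Phi\Big]. $$

Here again we may write $\Phi = (I_{v-1},\ -J_{v-1,1})\xi$, where $\xi = (\xi^{(1)}, \ldots, \xi^{(p)})$ and
$\xi_i^{(j)} = t_i^{(j)} - (1/v)\sum_{l=1}^{v} t_l^{(j)}$. Then $\Delta = (\lambda v/k)\,\mathrm{Ch}_{\max}(\xi'\xi)$. We have, from (24), one
set of "partial" statements of the form of (17) with

$$ \Delta_{(t)} = \frac{\lambda v}{k}\,\mathrm{Ch}_{\max}\Big[\Phi_{(t)}'\Big(I_t - \frac{1}{t + 1} J_t\Big)\Phi_{(t)}\Big], $$

or, from (25), $\Delta_{(t)} = (\lambda v/k)\,\mathrm{Ch}_{\max}[\xi_{(t)}'\xi_{(t)}]$, where $\xi_{(t)} = (\xi_{(t)}^{(1)}, \ldots, \xi_{(t)}^{(p)})$.
Similarly, we have, from (21), another set of "partial" statements of the form
of (18) with

$$ \Delta^{(w)} = \mathrm{Ch}_{\max}\Big[\frac{\lambda v}{k}\,\Phi^{(w)\prime}\big(I_{v-1} - (1/v)J_{v-1}\big)\Phi^{(w)}\Big] = \frac{\lambda v}{k}\,\mathrm{Ch}_{\max}[\xi^{(w)\prime}\xi^{(w)}]. $$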


4. Partially balanced incomplete block designs. 

(i) Univariate case. Consider a PBIBD with m associate classes and associ- 
ation matrices B; (i = 0, 1, --- , m). Thenit iswell known[1] thatC = }-foaB,, 
where By = I, , ao = r(k — 1)/k, a; = —d,/k, i= 1, --- , m; and, imposing the 
condition J,;,t = 0 on (9), we have 


t= (Sa B:)Q = EQ, say. 


It is well known that, when the design is connected, Rank C = v — 1, so that 
the condition J,,,t = 0 is sufficient to give unique solutions. Further 


(26) Jie = 0 and JieQ = 0. 


Let 
c =(%)" 71-0 = (G9), 
vXv _ 


E = (ey 1) = Ge = ( f). 
Then (2) t= (>) , where Q’ = (Q), Q,). Hence 


t = EQ = (Ei,e) (2) = £0 + eQ. 


Therefore, in view of (26), 


t = E:Q: — eJie-1Q: = (Et — Ji e-1)Qi = (Ei — €Ja-1'x) C- 


Hence 


(27) aT = (Ej; — eJi.-1'x), 
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so that, 
, ix f 
(E; — eJi»-1;x) =I,. 


lve 
Thus{E,{C, — @JieiC; + xj: = I,. Hence, in view of (26), 


E.C, + eC’ = 1, - XJi-s ’ 
that is, 


(28) EC = I, — xJis. 
Also, from (27), 


(7) (Ei — eiaix) = 1. 


Hence Cyx = 0. But C,J,, = O and Rank C, = »v — 1. Therefore,x = zJ,,. 


Furthermore, J;.x = 1, whence J.J... = 1, that is,z = v’. (28) thus reduces 
to 


(29) EC = I, — (1/v)J.. 
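Now $\hat\phi = (I_{v-1},\ -J_{v-1,1})t = (I_{v-1},\ -J_{v-1,1})EQ$, so that


$$D = (I_{v-1},\ -J_{v-1,1})\,ECE \begin{pmatrix} I_{v-1} \\ -J_{1,v-1} \end{pmatrix}.$$


Therefore, from (29) and (27),

$$D = (I_{v-1},\ -J_{v-1,1})\left(I_v - \tfrac{1}{v}J_{vv}\right)(E_1 - eJ_{1,v-1})$$

$$= (I_{v-1},\ -J_{v-1,1})(E_1 - eJ_{1,v-1})$$

(30)  $= E_{11} - fJ_{1,v-1} - J_{v-1,1}f' + e_0J_{v-1,v-1}$.

Furthermore, premultiplying both sides of the equation


$$\begin{pmatrix} E_{11} - fJ_{1,v-1} & (1/v)J_{v-1,1} \\ f' - e_0J_{1,v-1} & 1/v \end{pmatrix}\begin{pmatrix} C_1 \\ J_{1v} \end{pmatrix} = I_v$$


by $(I_{v-1},\ -J_{v-1,1})$, we have $DC_{11} = I_{v-1}$ and, therefore,

(31)  $D^{-1} = C_{11}$.

Hence, from (11),

(32)  $\Delta = \phi'C_{11}\phi = \xi'C\xi$.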


Here, we may note that $c_{ii} = a_0 = r(k-1)/k$ and $c_{ij} = a_l = -\lambda_l/k$ if the ith and
jth treatments are lth associates. Then we can have a confidence statement of the
form of (5) with $n = bk$, $r = b + v - 1$, $s = v - 1$ and $\Delta$ given by (32).
The “partial” statements of the form of (6), however, cannot be made in a
compact form, unless we know the association scheme. If we have $\phi_{(t)}' =
(\phi_1, \dots, \phi_t)$, then $D_{(t)}^{-1} = X - YW^{-1}Z$, where


$$C_{11} = \begin{pmatrix} X & Y \\ Z & W \end{pmatrix},$$


with X of order $t \times t$ and W of order $(v-t-1) \times (v-t-1)$,
and thus $\Delta_{(t)} = \phi_{(t)}'[X - YW^{-1}Z]\phi_{(t)}$.

(ii) Multivariate case. We have the confidence bounds of the form of (16) with
$n = bk$, $r = b + v - 1$, $s = v - 1$ and


$$\Delta = c_{\max}[\phi'C_{11}\phi] = c_{\max}[\xi'C\xi].$$

We have, as before, one set of “partial” statements of the form of (17) with
$\Delta_{(t)} = c_{\max}[\phi_{(t)}'(X - YW^{-1}Z)\phi_{(t)}]$.
The other set of “partial” statements is of the form of (18) with 
Aw = Cmax [60)Cude] = Cmax [ECE]. 


5. General “connected” incomplete block designs. It is well known [2] that,
in general, $Ct = Q$, which, on imposing the condition $J_{1v}t = 0$, yields $t = EQ$.
Then, arguing as before, from (26) to (32), we have


(33)  $\Delta = \phi'C_{11}\phi = \xi'C\xi$.


Then we can have a confidence statement of the form of (5) with


$$n = \sum_{i=1}^{v} r_i = \sum_{j=1}^{b} k_j, \qquad r = b + v - 1, \qquad s = v - 1$$


and $\Delta$ given by (33). We can have “partial” statements and confidence bounds
in the multivariate situation analogous to those for PBIBD.


6. Acknowledgment. I am indebted to Professor S. N. Roy for suggesting this 
problem and for suggesting improvements. 
A CLASS OF FACTORIAL DESIGNS WITH UNEQUAL 
CELL-FREQUENCIES 


By Gideon Schwarz 
Columbia University 

1. Summary. A class of multifactorial designs is defined and analyzed. Each of 
the designs considered has a total number of observations that cannot be 
divided equally among the cells of the design; however, by distributing the 
observations in a way that is in a certain sense symmetrical, the equations that 
determine the least squares estimates of the linear parameters become explicitly 
solvable. 

The case of two non-interacting factors with arbitrary numbers of levels is 
treated first. In the n-factor case we have to restrict ourselves to factors having 
equal numbers of levels. After defining the designs, the estimates are computed. 


Some general discussions of the symmetries and algebraic properties involved 
conclude the paper. 


2. Introduction. The first case to be considered is that of two non-interacting 
factors, with I and J levels respectively. For each pair i, j of levels the measured 
magnitude has an expected value $\eta_{ij}$. We assume that the $\eta_{ij}$ can be expressed 
in terms of I + J + 1 parameters $\{\mu, \alpha_i, \beta_j\}$ by the equations 


(1)  $\eta_{ij} = \mu + \alpha_i + \beta_j, \qquad \alpha_\cdot = \beta_\cdot = 0.$


The dot indicates as usual summation over the range of the index it replaces. 

Denoting by $y_{ijk}$ the kth measurement in the cell in which the factors A and B 
are applied at levels i and j respectively, we assume the $y_{ijk}$ to be normal 
independent random variables with means $\eta_{ij}$ and common variance $\sigma^2$. 

The experimenter is free to choose the number $n_{ij}$ of observations in each 
cell. The choice of the matrix $(n_{ij})$ may be influenced by three requirements: first, 
the cost of experimentation makes an unnecessarily large number of observations 
undesirable; second, for a given number $n_{\cdot\cdot}$ of observations, different ways 
of dividing this number among the different cells will result in different patterns 
of information about the parameters, and unless specific conditions about some 
of the levels are added, the design will be the closer to optimal the more evenly 
the number $n_{\cdot\cdot}$ is distributed among the cells; and last, it is impossible to write 
simple explicit formulae for the least squares estimates that hold for general 
$n_{ij}$, while for some classes of $(n_{ij})$-matrices such formulae can be found. 

Considering the last two requirements only, we are led to a well known class 
of designs, namely those in which all the $n_{ij}$ are equal, say to n. 
As we have $n_{\cdot\cdot} = nIJ$ for this class, we cannot regulate the total number of 
experiments except in jumps of IJ; in many cases this may lead to a violation 
of the first requirement. Consider for example a case in which one observation 
per cell would suffice for estimation of the parameters $\mu$, $\alpha_i$, $\beta_j$, while for the 
estimation of $\sigma^2$ we would want a few additional observations in some of the 
cells. Within the class of constant $n_{ij}$, this can be achieved only by doubling 
the total number of experiments. 

There have been various attempts to consider special designs with unequal 
frequencies (cf. References). Among the special cases treated by Daniel [2] 
and, in private communications with Daniel, by A. Birnbaum and Scheffé, there 
were designs with some symmetry properties. It was Birnbaum’s suggestion to 
look for a more general class of designs that led to the results described in this 
paper. 


3. Definition of S and Calculation of the Estimates. Let us proceed now to 
define the class S. We start out with the d by d unit matrix, d being any common 
divisor of I and J, and change it into an I by J matrix by replacing each of its 
“one” entries by an I/d by J/d matrix of ones, and each of its zeros by a similar 
matrix of zeros. This way we define a matrix 


$$\begin{pmatrix} \mathbf{1} & 0 & \cdots & 0 \\ 0 & \mathbf{1} & \cdots & 0 \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \cdots & \mathbf{1} \end{pmatrix},$$

in which each $\mathbf{1}$ stands for an I/d by J/d block of ones and each 0 for an I/d by J/d block of zeros.

Denoting this matrix by $A_{I,J,d}$, or for short $A_{IJ}$, we can now define S as the 
class of all designs with matrices $(n_{ij})$ that can be written either in the form 
$(n) + A_{IJ}$ or $(n) - A_{IJ}$, for some positive integer n and some common divisor d of I and J, 
where (n) denotes the I by J matrix having all entries equal to n. We claim: (a) 
the number $n_{\cdot\cdot}$ runs in the class S over all integers of the form 


$$IJ(n \pm d^{-1});$$


and (b) there is a simple explicit formula for the least-squares estimates that 
holds for all the designs in S. 

(a) becomes evident if we observe that $A_{IJ}$ has IJ/d non-zero entries, and 
we shall prove (b) by arriving at the formulae, first for the plus sign and then 
in general. 
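
The construction of the block matrix and claim (a) can be checked mechanically. The following is a minimal sketch, not part of the original argument; the function names `block_ones`, `design` and `total` are hypothetical, and the values I = 4, J = 6, d = 2, n = 1 are chosen only for illustration.

```python
from fractions import Fraction

def block_ones(I, J, d):
    """Return A_{I,J,d}: the d x d unit matrix with each one blown up to an
    (I/d) x (J/d) block of ones (d must divide both I and J)."""
    if I % d or J % d:
        raise ValueError("d must be a common divisor of I and J")
    return [[1 if i // (I // d) == j // (J // d) else 0 for j in range(J)]
            for i in range(I)]

def design(I, J, d, n, sign=+1):
    """Cell frequencies (n) + A (sign=+1) or (n) - A (sign=-1)."""
    A = block_ones(I, J, d)
    return [[n + sign * A[i][j] for j in range(J)] for i in range(I)]

def total(cells):
    """Sum of all entries of a matrix given as a list of rows."""
    return sum(map(sum, cells))

I, J, d, n = 4, 6, 2, 1
assert total(block_ones(I, J, d)) == I * J // d            # IJ/d non-zero entries
assert total(design(I, J, d, n, +1)) == I * J * (n + Fraction(1, d))
assert total(design(I, J, d, n, -1)) == I * J * (n - Fraction(1, d))
```

With these values the two designs contain 36 and 12 observations, against 24 for the single full replicate.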

The least squares estimates of the row effects can be obtained from the numbers 


(2)  $a_i = y_{i\cdot\cdot}/n_{i\cdot} - y_{\cdot\cdot\cdot}/n_{\cdot\cdot},$

which span uniquely the estimation space restricted by the side conditions. 
Each $a_i$ is a unique linear combination of the least squares estimates $\hat\alpha_i$, $\hat\beta_j$, 
given by 

(3)  $a_i = \hat\alpha_i + \dfrac{1}{J(n + d^{-1})} \sum\nolimits^{*} \hat\beta_j,$


where $\sum^{*}$ denotes summation over the cells with n + 1 observations only. 
Using vector notation $a = (a_1, \dots, a_I)$, etc., 


(4)  $a = \hat\alpha + \dfrac{d}{J(nd + 1)} A_{IJ}\hat\beta,$

and, by interchanging rows and columns, 

(5)  $b = \hat\beta + \dfrac{d}{I(nd + 1)} A_{JI}\hat\alpha.$


To eliminate $\hat\beta$ from equations (4) and (5), we subtract from (4) a suitable 
multiple of (5). Using the equation 


(6)  $A_{IJ}A_{JI} = (J/d)A_{II},$

which follows easily from the definition of $A_{IJ}$, we arrive at 

(7)  $a - \dfrac{d}{J(nd + 1)}A_{IJ}b = \hat\alpha - \dfrac{d}{I(nd + 1)^2}A_{II}\hat\alpha.$


In order to solve this equation, we have to invert a matrix which can be written, 
if we denote the unit matrix by $U_{II}$, as $U_{II} - [d/I(nd + 1)^2]A_{II}$. 

We can find the required inverse by finding the value of t that makes the 
product 


(8)  $\left(U_{II} - \dfrac{d}{I(nd + 1)^2}A_{II}\right)(U_{II} + tA_{II})$


equal to the unit matrix. Reducing the $A_{II}^2$ term by applying (6) we obtain 
$t = 1/In(nd + 2)$. Having found the inverse we can now solve equation (7). 
Denoting by R and C vectors of row and column-sums, respectively, and by S 
a vector with I components, all equal to the grand total $y_{\cdot\cdot\cdot}$, we have 


(9)  $\hat\alpha = \dfrac{d}{J(nd + 1)}R + \dfrac{d}{IJn(nd + 1)(nd + 2)}A_{II}R - \dfrac{d}{IJn(nd + 2)}A_{IJ}C - \dfrac{d}{IJ(nd + 2)}S.$

The corresponding formula for $\hat\beta$ is easily obtained by interchanging R and C, 
as well as I and J. The estimate of $\mu$ is obviously equal to $[d/IJ(nd + 1)]y_{\cdot\cdot\cdot}$, 
the mean of all observations. The change in the formulae for the case $(n_{ij}) = 
(n) - A_{IJ}$ will consist of changing the signs of d, and of all the matrices. 
Merging both cases into one, and denoting by S also a vector with J components 
all equal to $y_{\cdot\cdot\cdot}$, we have finally 

(10)  $\hat\alpha = \dfrac{d}{J(nd \pm 1)}R + \dfrac{d}{IJn(nd \pm 1)(nd \pm 2)}A_{II}R \mp \dfrac{d}{IJn(nd \pm 2)}A_{IJ}C - \dfrac{d}{IJ(nd \pm 2)}S,$

(11)  $\hat\beta = \dfrac{d}{I(nd \pm 1)}C + \dfrac{d}{JIn(nd \pm 1)(nd \pm 2)}A_{JJ}C \mp \dfrac{d}{JIn(nd \pm 2)}A_{JI}R - \dfrac{d}{JI(nd \pm 2)}S,$

(12)  $\hat\mu = \dfrac{d}{IJ(nd \pm 1)}\,y_{\cdot\cdot\cdot},$

where the upper signs refer to $(n) + A_{IJ}$ and the lower signs to $(n) - A_{IJ}$.
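
As a check on formulae (10) to (12), one can verify in exact arithmetic that the resulting estimates satisfy the least squares normal equations and the side conditions. The sketch below is illustrative only: the function names are hypothetical, the cell totals are random integers rather than data from the paper, and only the row and column normal equations are tested.

```python
from fractions import Fraction as F
import random

def block_ones(I, J, d):
    """A_{I,J,d}: d diagonal blocks of ones, each of size (I/d) x (J/d)."""
    return [[1 if i // (I // d) == j // (J // d) else 0 for j in range(J)]
            for i in range(I)]

def matvec(M, v):
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def closed_form(I, J, d, n, s, R, C, T):
    """mu, alpha, beta from (10)-(12); s = +1 for (n) + A, s = -1 for (n) - A."""
    p1, p2 = n * d + s, n * d + 2 * s              # nd +/- 1 and nd +/- 2
    AR, AC = matvec(block_ones(I, I, d), R), matvec(block_ones(I, J, d), C)
    BC, BR = matvec(block_ones(J, J, d), C), matvec(block_ones(J, I, d), R)
    alpha = [F(d, J * p1) * R[i] + F(d, I * J * n * p1 * p2) * AR[i]
             - s * F(d, I * J * n * p2) * AC[i] - F(d, I * J * p2) * T
             for i in range(I)]
    beta = [F(d, I * p1) * C[j] + F(d, I * J * n * p1 * p2) * BC[j]
            - s * F(d, I * J * n * p2) * BR[j] - F(d, I * J * p2) * T
            for j in range(J)]
    return F(d, I * J * p1) * T, alpha, beta

def satisfies_normal_equations(I, J, d, n, s, seed=0):
    rng = random.Random(seed)
    A = block_ones(I, J, d)
    cells = [[n + s * A[i][j] for j in range(J)] for i in range(I)]
    # y[i][j] is the sum of the n_ij observations in cell (i, j).
    y = [[rng.randint(-9, 9) if cells[i][j] else 0 for j in range(J)]
         for i in range(I)]
    R = [sum(row) for row in y]
    C = [sum(y[i][j] for i in range(I)) for j in range(J)]
    T = sum(R)
    mu, al, be = closed_form(I, J, d, n, s, R, C, T)
    fit = lambda i, j: mu + al[i] + be[j]
    rows = all(R[i] == sum(cells[i][j] * fit(i, j) for j in range(J))
               for i in range(I))
    cols = all(C[j] == sum(cells[i][j] * fit(i, j) for i in range(I))
               for j in range(J))
    return rows and cols and sum(al) == 0 and sum(be) == 0
```

For example, `satisfies_normal_equations(4, 6, 2, 1, +1)` checks a design with 36 observations, and `satisfies_normal_equations(6, 3, 3, 1, -1)` a design with empty cells. The excluded case n = 1, d = 2 with the lower sign raises `ZeroDivisionError`, since nd - 2 = 0.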
As a final remark, let us note that our definitions and formulae are valid as 
long as n is at least 1, and d is at least 2, with the exception of the case n = 1, 
d = 2, in which nd − 2 equals zero, and the $n - d^{-1}$ replicate is not sufficient 
for estimation of the parameters. On the other hand, since for d ≥ 3, n = 1 the 
formulae remain meaningful also for the lower sign, certain designs with some 
empty cells are included in the class considered here. 

Most of what has been done in the preceding section admits a rather 
straightforward generalization to the case of q factors acting additively, that is, with 
no interactions of any order. The only step that is not generalized so easily is 
the reduction of equations (4) and (5), each involving both row-effect estimates 
and column-effect estimates, to equation (7), which isolates the row effects. In 
order to make possible an explicit solution to the analogous problem in the case 
of many factors, we have to restrict our considerations to designs having an 
equal number of levels for every factor. Denoting the effect of the hth factor at 
its ith level by $\alpha_i(h)$, we define our model by the equations 


(13)  $y_{\mathbf{i}k} = \mu + \sum_{h=1}^{q} \alpha_{i(h)}(h) + e_{\mathbf{i}k},$


where $\mathbf{i}$ denotes the vector (i(1), ..., i(q)), and k = 1, 2, ..., $n_{\mathbf{i}}$, with the error 
terms distributed as usual. About the parameters we assume $\sum_i \alpha_i(h) = 0$, 
h = 1, 2, ..., q. 

In order to determine the number of observations in each cell, we choose a 
divisor d of the number of levels I, and construct a q-dimensional hyper-cube of 
side-length d. Putting d ones at the grid points along the q-space-diagonal of 
the hyper-cube, and zeros at the other grid points, we obtain the q-dimensional 
analogue of the d by d unit matrix. Replacing each (q − 1)-dimensional layer 
by I/d identical layers, an array of $I^q$ points is obtained, $I^q/d^{q-1}$ of which carry 
units. If we start out with an $I^q$-design having n observations in each cell, and 
add ±1 observation to each cell that corresponds to a unit in the array, an 
$n \pm 1/d^{q-1}$ duplicate will be obtained. 

Defining the numbers $a_i(h)$ as the average of the observations in the layer 
determined by a given level of a given factor, minus the average of all 
observations, we get a system of vector equations: 


(14)
$$\begin{aligned}
a(1) &= \hat\alpha(1) + gA_{II}\hat\alpha(2) + gA_{II}\hat\alpha(3) + \cdots + gA_{II}\hat\alpha(q) \\
a(2) &= gA_{II}\hat\alpha(1) + \hat\alpha(2) + gA_{II}\hat\alpha(3) + \cdots + gA_{II}\hat\alpha(q) \\
&\ \ \vdots \\
a(q) &= gA_{II}\hat\alpha(1) + \cdots + gA_{II}\hat\alpha(q-1) + \hat\alpha(q)
\end{aligned}$$


where $g = [d^{q-1}/I^{q-1}(nd^{q-1} + 1)](I/d)^{q-2} = d/I(nd^{q-1} + 1)$. The first factor 
in g is the reciprocal of the number of observations per (q − 1)-dimensional 
partial design. The second factor is the number of higher populated cells that 
two levels of different factors have in common. 
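
The counting statements behind the q-dimensional array and the constant g can be confirmed by brute-force enumeration. The sketch below assumes the plus-sign case and uses illustrative values I = 4, d = 2, q = 3, n = 1; the helper names `has_unit` and `unit_cells` are hypothetical.

```python
from itertools import product
from fractions import Fraction

def has_unit(cell, I, d):
    """True if all coordinates of the cell fall in the same diagonal block
    of side I/d, i.e. the cell carries a unit in the array."""
    return len({i // (I // d) for i in cell}) == 1

def unit_cells(I, d, q):
    """All cells of the I^q array that carry a unit."""
    return [c for c in product(range(I), repeat=q) if has_unit(c, I, d)]

I, d, q, n = 4, 2, 3, 1
units = unit_cells(I, d, q)
assert len(units) == I**q // d**(q - 1)          # I^q / d^(q-1) units

# Higher populated cells shared by level 0 of factor 1 and level 1 of factor 2
# (both levels lie in the same block).
shared = sum(1 for c in units if c[0] == 0 and c[1] == 1)
assert shared == (I // d) ** (q - 2)

# Observations in one (q-1)-dimensional layer, and the resulting g.
per_layer = n * I**(q - 1) + sum(1 for c in units if c[0] == 0)
g = Fraction(1, per_layer) * shared
assert g == Fraction(d, I * (n * d**(q - 1) + 1))
```

Here each layer holds 20 observations and g = 1/10.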

For the solution of (14) inversion of a q by q matrix having I by I matrices 
as elements (a q by q by I by I tensor of the fourth degree) is required. We start 
by considering the matrix $G_x$, obtained by replacing $A_{II}$ in the tensor of (14) by 
a scalar variable x. Putting for its inverse 


$$G_x^{-1} = \begin{pmatrix} z & y & \cdots & y \\ y & z & \cdots & y \\ \vdots & & \ddots & \vdots \\ y & y & \cdots & z \end{pmatrix},$$


we find 

$$y = \frac{gx}{(q-1)g^2x^2 - (q-2)gx - 1},$$

$$z = \frac{-(q-2)gx - 1}{(q-1)g^2x^2 - (q-2)gx - 1}.$$

We can write this result in a form that does not involve any fractions: 

(15)  $G_x\left[\left((q-1)g^2x^2 - (q-2)gx - 1\right)G_x^{-1}\right] = \left((q-1)g^2x^2 - (q-2)gx - 1\right)U_{qq},$

where the expression in the square brackets equals a q by q matrix with 
$-(q-2)gx - 1$ along the diagonal, and $gx$ in the other places. Having disposed 
of fractions, we can now substitute $A_{II}$ for x. Carrying out the substitution 
in equation (15), $G_x$ becomes the tensor of (14), and the expression in the 
square brackets becomes a tensor having $-(q-2)gA_{II} - U_{II}$ along its diagonal, 
and $gA_{II}$ in the other places. 


Applying all this to (14), we arrive at the reduced equations, 


$$\left((q-1)g^2A_{II}^2 - (q-2)gA_{II} - U_{II}\right)\hat\alpha(1) = [-(q-2)gA_{II} - U_{II}]\,a(1) + gA_{II}[a(2) + \cdots + a(q)].$$

We can now proceed as in the two-factor case, and get, putting N for $nd^{q-1}$, 

(16)  $\hat\alpha(1) = \dfrac{d^{q-1}}{I^{q-1}(N \pm 1)}S(1) + \dfrac{(q-1)d^{q}}{I^{q}N(N \pm 1)(N \pm q)}A_{II}S(1) \mp \dfrac{d^{q}}{I^{q}N(N \pm q)}A_{II}[S(2) + \cdots + S(q)] - \dfrac{d^{q-1}}{I^{q}(N \pm q)}S.$

Here S(h) denotes the vector of layer totals for the hth factor and S the vector with all components equal to the grand total, so that for q = 2 the formula reduces to (10) with I = J.

As in the 2-factor case, the lower signs serve for the $n - 1/d^{q-1}$ duplicate. Since 
for q ≥ 3 we have $N = nd^{q-1} > q$, the denominators never vanish except in 
the case mentioned before when q = 2 and d = 2, and the formulae are valid 
unrestrictedly. By permuting the factors, estimates for the other factors can 
be easily obtained. 
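
The q-factor formula can be tested in the same way as the two-factor one, by checking that the estimates it produces satisfy the normal equations exactly. In the sketch below the coefficient of the $A_{II}S(1)$ term is taken as $(q-1)d^q/I^qN(N \pm 1)(N \pm q)$, which is the value that reduces to the $A_{II}R$ term of the two-factor formula when q = 2; the function names and the random cell totals are assumptions made for illustration.

```python
from fractions import Fraction as F
from itertools import product
import random

def block(i, I, d):
    """Index of the diagonal block of side I/d that contains level i."""
    return i // (I // d)

def check_formula_16(I, d, q, n, s, seed=0):
    """True if the q-factor closed form, applied to every factor, satisfies
    the normal equations; s = +1 (upper signs) or s = -1 (lower signs)."""
    rng = random.Random(seed)
    cells = list(product(range(I), repeat=q))
    # Cell frequencies: n everywhere, changed by s on the diagonal blocks.
    freq = {c: n + s * (len({block(i, I, d) for i in c}) == 1) for c in cells}
    tot = {c: rng.randint(-9, 9) if freq[c] else 0 for c in cells}  # cell totals
    S = [[sum(tot[c] for c in cells if c[h] == i) for i in range(I)]
         for h in range(q)]                                         # layer totals
    T = sum(tot.values())
    N = n * d ** (q - 1)

    def A(v):
        # Multiplication by A_II: sum the components lying in the same block.
        return [sum(v[j] for j in range(I) if block(j, I, d) == block(i, I, d))
                for i in range(I)]

    c1 = F(d ** (q - 1), I ** (q - 1) * (N + s))
    c2 = F((q - 1) * d ** q, I ** q * N * (N + s) * (N + s * q))
    c3 = F(d ** q, I ** q * N * (N + s * q))
    c4 = F(d ** (q - 1), I ** q * (N + s * q))
    alpha = []
    for h in range(q):
        others = [sum(S[k][i] for k in range(q) if k != h) for i in range(I)]
        AS, AO = A(S[h]), A(others)
        alpha.append([c1 * S[h][i] + c2 * AS[i] - s * c3 * AO[i] - c4 * T
                      for i in range(I)])
    mu = F(T, sum(freq.values()))                 # mean of all observations
    fit = {c: mu + sum(alpha[h][c[h]] for h in range(q)) for c in cells}
    layers_ok = all(S[h][i] == sum(freq[c] * fit[c] for c in cells if c[h] == i)
                    for h in range(q) for i in range(I))
    return layers_ok and all(sum(a) == 0 for a in alpha)
```

For instance, `check_formula_16(4, 2, 3, 1, +1)` covers a three-factor design with 80 observations in 64 cells, and `check_formula_16(4, 2, 3, 1, -1)` the corresponding design with 16 empty cells.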


4. General Symmetrical Designs. In this section we shall examine more closely the 
symmetry properties that the designs treated in this paper have in common. 

The various symmetry properties of the designs having equal numbers of 
observations in all cells are implied by the invariance of these designs under all 
permutations of the levels of any factor; furthermore, those designs, the “full 
multiple replicates”, are the only ones left invariant by all permutations. 
Certainly, invariance under all permutations assures us of equal treatment of all 
levels of each factor. It implies an even stronger property: different ordered pairs 
of levels will enter the design similarly, as will any different ordered n-tuples. 
This additional property is certainly welcome. Some of the questions the designed 
experiment might be called upon to answer do involve pairs or other sets of levels, 
and it would be natural to expect symmetric treatment of these questions as 
well. However, we know that we have to give up some requirements if we want 
to include fractional replicates, and it is this “symmetry of subsets” that we 
choose to sacrifice. 

Let us examine the freedom gained by requiring symmetry with regard to 
single levels only, by looking at the two-factor case. In this case, the design is 
determined by a matrix having the cell frequencies as its entries. Applying single- 
level symmetry to the row factor, we find that the rows of the matrix have to be 
equal to each other; however, as the order of entries in a row is determined by the 
order of levels of the column factor, the order in a row is immaterial, and the 
word “equal” should be read “differing only by a permutation of their elements”. 
Similar “equality” is implied for the columns of the matrix. Any unit matrix can 
now serve as an example of a matrix having the required properties and yet not 
belonging to the full replicate designs. 

Returning to the multifactorial designs, we arrive at the following formulation 
of our symmetry requirements: 

DEFINITION: A design is called “symmetrical with regard to single levels”, or 
from here on, for short, “symmetrical”, if the two partial designs resulting from 
fixing any one of the factors at two different levels can be transformed one into 
the other by permuting the levels of the other factors. 

We now restrict our class of designs even further, by introducing a restriction 
that is not motivated solely by considerations of symmetry. The designs we 
shall consider will all have only two different numbers of observations per cell; 
furthermore, those two numbers will differ from each 
other by one. We justify this restriction by the following “optimality argument”: 
the definition of a symmetrical design implies that the different cell frequencies 
appearing in partial designs belonging to different levels of the same factor will 
be the same, possibly differently arranged. If there were two cells in the design 
whose numbers of observations differed by more than one, we would find two 
such cells in every subdesign and a new design could be defined by decreasing 
by one the number of observations in the higher populated cell and increasing 
it in the other. The resulting design would still be symmetrical and have the same 
total number of observations as the original design. Since the fully replicated 
design is in some sense optimal whenever the given total number of observations 
makes equally populated cells possible, we can interpret the above restriction as 
an attempt to avoid unnecessary deviations from optimality. 

Having narrowed down the class of designs, we can now turn to the last re- 
quirement: existence of explicit estimation formulae. 

The fact that permitted us to look for an inverse of a linear polynomial in 
$A_{II}$ among the set of linear polynomials in $A_{II}$ is the degree of the minimal 
polynomial of $A_{II}$: it is quadratic. In general, the inverse of any regular matrix P 
that is a polynomial P(A), where A has a minimal polynomial of degree r, can 
be written as Q(A), Q being of degree r − 1 at most. 

PROOF: The set of all such Q(A) is a ring and in this ring the ideal generated 
by P(A) must be the whole ring, otherwise it would be of lower dimension and 
P(A) would be singular. Therefore, P(A) has an inverse among the Q(A). 

In general for an I by I matrix r can be any number from 1 to I. As we have to 
find r constants in order to invert the matrix, the inversion can be done simply 
only for low r. The class S can be characterized as the class of matrices having 
the symmetries and optimality properties mentioned above, and a minimal 
polynomial of degree r = 2. 
A GENERALIZATION OF GROUP DIVISIBLE DESIGNS 


By Damaraju Raghavarao 
University of Bombay 


1. Summary and Introduction. Roy [8] extended the idea of Group Divisible 
designs of Bose and Connor [1] to m-associate classes, calling such designs Hier- 
archical Group Divisible designs with m-associate classes. Subsequently, no 
literature is found in this direction. The purpose of this paper is to study these 
designs systematically. A compact definition of the design, under the name 
Group Divisible m-associate (GD m-associate) design is given in Section 2. In 
the same section the parameters of the design are obtained in a slightly different 
form than that of Roy. The uniqueness of the association scheme from the 
parameters is shown in Section 3. The designs are divided into (m + 1) classes 
in Section 4. Some interesting combinatorial properties are obtained in Section 
5. The necessary conditions for the existence of a class of these designs are ob- 
tained in Section 7. Finally, some numerical illustrations of these designs are 
given in the Appendix. 


2. Definition and Parameters of a Group Divisible m-associate Design. 

DEFINITION 2.1. A Group Divisible m-associate design may be defined as 
follows: 

(i) The experimental material is divided into b blocks of k units each, different 
treatments being applied to the units in the same block. 

(ii) There are $v = N_1N_2 \cdots N_m$ treatments denoted by 


$v_{i_1i_2 \cdots i_m}$  $(i_1 = 1, 2, \dots, N_1;\ i_2 = 1, 2, \dots, N_2;\ \dots;\ i_m = 1, 2, \dots, N_m)$.


Each treatment occurs once in each of r blocks. 

(iii) There can be established a relation of association between any two treat- 
ments satisfying the following requirements: 

(a) Two treatments having only the first (m − j) suffixes of $v_{i_1i_2 \cdots i_m}$ the 
same are jth associates (j = 1, 2, ..., m). 

(b) Each treatment has exactly $n_j$ jth associates. 

(c) Given any two treatments which are ith associates, the number of treat- 
ments common to the jth associates of the first and the kth associates of the 
second is $p^i_{jk}$ and is independent of the pair of treatments with which we start. 
Also, $p^i_{jk} = p^i_{kj}$ (i, j, k = 1, 2, ..., m). 

(iv) Two treatments which are jth associates occur together in $\lambda_j$ blocks. 

The numbers b, r, k, $N_1, N_2, \dots, N_m$, $\lambda_1, \lambda_2, \dots, \lambda_m$ are known as the 
parameters of the GD m-associate design. We can easily see that 


(2.1)  $n_j = N_mN_{m-1} \cdots N_{m-j+2}(N_{m-j+1} - 1), \qquad j = 1, 2, \dots, m;$
(2.2)  $(p^i_{jk}) = \begin{pmatrix} 0_{(i-1) \times (i-1)} & \begin{matrix} x_i & 0_{(i-1) \times (m-i)} \end{matrix} \\ \begin{matrix} x_i' \\ 0_{(m-i) \times (i-1)} \end{matrix} & D_{(m-i+1) \times (m-i+1)} \end{pmatrix},$


where $0_{i \times i'}$ is a null matrix of the order $i \times i'$; $x_i$ is the (i − 1)th order column 
vector with elements $n_1, n_2, \dots, n_{i-1}$; $x_i'$ is the transpose of $x_i$; and 
$D_{(m-i+1) \times (m-i+1)}$ is the diagonal matrix with elements 
$N_mN_{m-1} \cdots N_{m-i+2}(N_{m-i+1} - 2)$, $n_{i+1}, \dots, n_m$. The parameters satisfy the relations 


$$N_1N_2 \cdots N_m r = bk; \qquad \sum_{\alpha=1}^{m} n_\alpha = N_1N_2 \cdots N_m - 1; \qquad \sum_{\alpha=1}^{m} n_\alpha\lambda_\alpha = r(k-1);$$


$$n_ip^i_{jk} = n_jp^j_{ik} = n_kp^k_{ij}; \qquad \sum_{k=1}^{m} p^i_{jk} = n_j - \delta_{ij}; \qquad i, j, k = 1, 2, \dots, m,$$


where $\delta_{ij}$ is the Kronecker delta taking the value 1 or 0 according as i = j or 
i ≠ j. Since the parameters satisfy the above relations, it can be seen that a 
GD m-associate design is a special case of Partially Balanced Incomplete Block 
Designs defined by Bose and Nair [2]. 
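
The association scheme iii(a) is simple enough to enumerate directly, which gives an independent check of the counts $n_j$, of the constancy of the $p^i_{jk}$, and of the relation $\sum_k p^i_{jk} = n_j - \delta_{ij}$. The sketch below is illustrative; `assoc_class` and `check_scheme` are hypothetical names, treatments are represented as tuples of suffixes, and `N` lists $N_1, \dots, N_m$.

```python
from itertools import product
from math import prod

def assoc_class(s, t):
    """j such that s and t agree in exactly the first m - j suffixes
    (0 if s and t are the same treatment)."""
    m = len(s)
    common = next((i for i in range(m) if s[i] != t[i]), m)
    return m - common

def check_scheme(N):
    m = len(N)
    T = list(product(*(range(x) for x in N)))
    # n[j] = N_m N_{m-1} ... N_{m-j+2} (N_{m-j+1} - 1); n[0] is unused.
    n = [0] + [prod(N[m - j + 1:]) * (N[m - j] - 1) for j in range(1, m + 1)]
    seen = {}
    for s in T:
        counts = [sum(assoc_class(s, u) == j for u in T) for j in range(m + 1)]
        if counts[1:] != n[1:]:
            return False
        for t in T:
            i = assoc_class(s, t)
            if i == 0:
                continue
            P = [[sum(assoc_class(s, u) == j and assoc_class(t, u) == k
                      for u in T)
                  for k in range(1, m + 1)] for j in range(1, m + 1)]
            if seen.setdefault(i, P) != P:        # p^i_jk must not depend on the pair
                return False
    for i, P in seen.items():
        for j in range(1, m + 1):
            if sum(P[j - 1]) != n[j] - (i == j):  # sum_k p^i_jk = n_j - delta_ij
                return False
        # Leading diagonal element of D for the ith association matrix.
        if P[i - 1][i - 1] != prod(N[m - i + 1:]) * (N[m - i] - 2):
            return False
    return True
```

For example, `check_scheme((2, 3, 2))` enumerates the 12 treatments of a three-associate scheme with $N_1 = 2$, $N_2 = 3$, $N_3 = 2$.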


3. Uniqueness of the Association Scheme. This section shows that the rela- 
tions (2.1) and (2.2) imply the association scheme iii(a). In this section, we call 
a group of treatments which are first associates a first-associate group; a group 
of first-associate groups a second-associate group, etc. Let $\theta$ be any treatment. 
Let $\theta^{(i)}_1, \theta^{(i)}_2, \dots, \theta^{(i)}_{n_i}$ be its ith associates (i = 1, 2, ..., m). Consider the 
treatments $\theta$ and $\theta^{(1)}_1$. Since $n_1 = N_m - 1$ and $p^1_{11} = N_m - 2$, the first associates 
of $\theta^{(1)}_1$ except $\theta$ are the same as the first associates of $\theta$ except $\theta^{(1)}_1$. Also, as 

$$p^1_{1i} = 0 \quad (i = 2, 3, \dots, m),$$


we can divide the treatments into first-associate groups such that treatments in 
different first-associate groups are 2nd, 3rd, ..., or mth associates. It can be 
seen that each first-associate group contains $N_m$ treatments. Thus the v treat- 
ments are divided into $N_1N_2 \cdots N_{m-1}$ first-associate groups of $N_m$ treatments 
each. 

Now, consider the treatments $\theta$ and $\theta^{(2)}_1$. Since 


$$n_i = p^2_{ii} = N_mN_{m-1} \cdots N_{m-i+2}(N_{m-i+1} - 1),$$


it is obvious that the ith associates of $\theta$ and $\theta^{(2)}_1$ are the same (i = 3, 4, ..., m). 


Also, as $p^2_{11} = 0$ and $p^2_{22} = N_m(N_{m-1} - 2)$, the $N_1N_2 \cdots N_{m-1}$ first-associate 
groups of the above paragraph can be subdivided into $N_1N_2 \cdots N_{m-2}$ 
second-associate groups of $N_{m-1}$ first-associate groups of $N_m$ treatments each such that 
(i) treatments in different second-associate groups are 3rd, 4th, ..., or mth 
associates, and (ii) treatments in different first-associate groups of a 
second-associate group are the second associates. 

Again, consider the treatments $\theta$ and $\theta^{(3)}_1$. Since 


$$n_i = p^3_{ii} = N_mN_{m-1} \cdots N_{m-i+2}(N_{m-i+1} - 1),$$


it can be seen that the ith associates of $\theta$ and $\theta^{(3)}_1$ are the same (i = 4, 5, ..., m). 
Also, as $p^3_{11} = 0 = p^3_{12} = p^3_{22}$ and $p^3_{33} = N_mN_{m-1}(N_{m-2} - 2)$, the $N_1N_2 \cdots N_{m-2}$ 
second-associate groups can be further grouped into $N_1N_2 \cdots N_{m-3}$ 
third-associate groups each containing $N_{m-2}$ second-associate groups. These 
second-associate groups contain $N_{m-1}$ first-associate groups each containing $N_m$ 
treatments. Treatments in different third-associate groups are 4th, 5th, ..., or mth 
associates. Treatments in different second-associate groups of a third-associate 
group are the third associates and treatments in different first-associate groups 
of a second-associate group are the second associates. 

By similar reasoning, we finally obtain $N_1$ (m − 1)-associate groups of 
$N_2$ (m − 2)-associate groups, ..., of $N_{m-1}$ first-associate groups of $N_m$ 
treatments. The above grouping will be such that (i) treatments in different 
(m − 1)-associate groups are the mth associates, and (ii) treatments in different 
i-associate groups of an (i + 1)-associate group are the (i + 1)th associates 


(i = 1, 2, ..., m − 2). 


We can easily see that the above grouping of the treatments is the same as 
the association scheme iii(a). Hence the parameters (2.1) and (2.2) define the 
association scheme iii(a) uniquely and we have the following: 

THEOREM 3.1. The relations (2.1) and (2.2) for a Group Divisible m-associate 
design uniquely define the association scheme iii(a). 


4. Characterization of Group Divisible m-associate Designs. Let $n_{ij} = 1$, if 
the ith treatment occurs in the jth block; and $n_{ij} = 0$, otherwise. Then the 
v × b matrix $N = (n_{ij})$ is known as the incidence matrix of the GD m-associate 
design. From the definition of GD m-associate design, we can see that 


$$\sum_{j=1}^{b} n_{ij}^2 = r, \quad i = 1, 2, \dots, v; \qquad \text{and} \qquad \sum_{j=1}^{b} n_{ij}n_{i'j} = \lambda_1, \lambda_2, \dots, \text{ or } \lambda_m$$


according as i and i′ are 1st, 2nd, ..., or mth associates, i ≠ i′; i, i′ = 1, 2, 
..., v. Now, by suitably marking the treatments, we have 

(4.1)  $NN' = \begin{pmatrix} B_m & A_m & \cdots & A_m \\ A_m & B_m & \cdots & A_m \\ \vdots & & \ddots & \vdots \\ A_m & A_m & \cdots & B_m \end{pmatrix},$

where, at any stage, 


(4.2)  $B_i = \begin{pmatrix} B_{i-1} & A_{i-1} & \cdots & A_{i-1} \\ A_{i-1} & B_{i-1} & \cdots & A_{i-1} \\ \vdots & & \ddots & \vdots \\ A_{i-1} & A_{i-1} & \cdots & B_{i-1} \end{pmatrix}, \qquad i = 2, 3, \dots, m;$


$$A_i = \lambda_i E_{N_{m-i+2}N_{m-i+3} \cdots N_m}, \quad i = 2, 3, \dots, m; \qquad B_1 = r, \quad A_1 = \lambda_1,$$


where $E_{N_{m-i+2}N_{m-i+3} \cdots N_m}$ is an $N_{m-i+2}N_{m-i+3} \cdots N_m$th order square matrix 
with positive unit elements everywhere. The orders of NN′ and $B_i$ are $N_1N_2 \cdots 
N_m$ and $N_{m-i+2}N_{m-i+3} \cdots N_m$ respectively (i = 2, 3, ..., m). The matrices 
$A_1$ and $B_1$ are of unit order. Det (NN′) can be evaluated in the usual manner 
and we get 


(4.3)  $|NN'| = rk\,P_1^{N_1-1}P_2^{N_1(N_2-1)} \cdots P_m^{N_1N_2 \cdots N_{m-1}(N_m-1)},$

where 


(4.4)  $P_i = (r - \lambda_{m-i+1}) + (\lambda_1 - \lambda_{m-i+1})n_1 + \cdots + (\lambda_{m-i} - \lambda_{m-i+1})n_{m-i}, \qquad i = 1, 2, \dots, m.$


By replacing r by (r − z) in det (NN′) we can easily see that rk and the $P_i$’s 
(i = 1, 2, ..., m) are the distinct characteristic roots of NN′. We know from 
the result of Connor and Clatworthy [4] that the characteristic roots of NN′ 
cannot be negative for an existing design. Thus we have the following theorem: 

THEOREM 4.1. A necessary condition for the existence of a Group Divisible 
m-associate design is that $P_i \ge 0$ (i = 1, 2, ..., m). 
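
The determinant formula (4.3) can be checked numerically without constructing an actual design, because NN′ depends only on r, the λ's and the association scheme. The sketch below builds the matrix from that pattern (r on the diagonal, $\lambda_j$ for jth associates) and compares its exact determinant with $rk\prod_i P_i^{e_i}$, where $e_i = N_1 \cdots N_{i-1}(N_i - 1)$. The function names are hypothetical, the parameter values are chosen for illustration only, and rk is computed as $r + \sum_j n_j\lambda_j$, which equals rk whenever $\sum_j n_j\lambda_j = r(k-1)$.

```python
from fractions import Fraction
from itertools import product
from math import prod

def assoc_class(s, t):
    """j such that s and t agree in exactly the first m - j suffixes."""
    m = len(s)
    return m - next((i for i in range(m) if s[i] != t[i]), m)

def n_values(N):
    m = len(N)
    return [prod(N[m - j + 1:]) * (N[m - j] - 1) for j in range(1, m + 1)]

def P_values(N, r, lam):
    """P_1, ..., P_m from (4.4); lam[j-1] = lambda_j and N[h-1] = N_h."""
    m, n = len(N), n_values(N)
    return [(r - lam[m - i]) + sum((lam[j] - lam[m - i]) * n[j]
                                   for j in range(m - i))
            for i in range(1, m + 1)]

def det(M):
    """Exact determinant by Gaussian elimination over the rationals."""
    M = [[Fraction(x) for x in row] for row in M]
    size, result = len(M), Fraction(1)
    for c in range(size):
        pivot = next((i for i in range(c, size) if M[i][c] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != c:
            M[c], M[pivot] = M[pivot], M[c]
            result = -result
        result *= M[c][c]
        for i in range(c + 1, size):
            f = M[i][c] / M[c][c]
            M[i] = [a - f * b for a, b in zip(M[i], M[c])]
    return result

def nn_prime(N, r, lam):
    """Matrix with r on the diagonal and lambda_j between jth associates."""
    T = list(product(*(range(x) for x in N)))
    return [[r if s == t else lam[assoc_class(s, t) - 1] for t in T] for s in T]

def det_formula(N, r, lam):
    m, n = len(N), n_values(N)
    rk = r + sum(l * x for l, x in zip(lam, n))   # = rk for a genuine design
    exps = [prod(N[:i - 1]) * (N[i - 1] - 1) for i in range(1, m + 1)]
    return rk * prod(p ** e for p, e in zip(P_values(N, r, lam), exps))
```

With N = (2, 2, 2), r = 11 and λ = (3, 2, 1), for instance, `P_values` returns [14, 10, 8] and both sides of (4.3) equal $22 \cdot 14 \cdot 10^2 \cdot 8^4$.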

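The designs with the following parameters violate the above necessary 
condition and hence are impossible. The reason for impossibility is shown in brackets 
against the parameters. 

1. v = 90 = b, r = 9 = k, $N_1 = 3$, $N_2 = 15$, $N_3 = 2$, 
$\lambda_1 = 12$, $\lambda_2 = 0$, $\lambda_3 = 1$ ($P_1, P_3 < 0$). 

2. v = 12, b = 15, r = 5, k = 4, $N_1 = 2 = N_2$, $N_3 = 3$, 
$\lambda_1 = 0$, $\lambda_2 = 3$, $\lambda_3 = 1$ ($P_2 < 0$). 

3. v = 8, b = 4, r = 3, k = 6, $N_1 = 2 = N_2 = N_3$, 
$\lambda_1 = 3$, $\lambda_2 = 0$, $\lambda_3 = 3$ ($P_1 < 0$). 

4. v = 16 = b, r = 5 = k, $N_1 = 2 = N_2$, $N_3 = 4$, 
$\lambda_1 = 0$, $\lambda_2 = 1$, $\lambda_3 = 2$ ($P_1 < 0$). 

5. v = 16, b = 24, r = 6, k = 4, $N_1 = 2 = N_2 = N_3 = N_4$, 
$\lambda_1 = 0$, $\lambda_2 = 1$, $\lambda_3 = 0$, $\lambda_4 = 2$ ($P_1 < 0$). 

6. v = 32, b = 64, r = 10, k = 5, $N_1 = 2 = N_2 = N_3 = N_4 = N_5$, 
$\lambda_1 = 4$, $\lambda_2 = 0$, $\lambda_3 = 1$, $\lambda_4 = 0$, $\lambda_5 = 2$ 
($P_1 < 0$). 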
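
Applying Theorem 4.1 to a given parameter set is a short computation. As an illustration, the sketch below evaluates the $P_i$ for the design with v = 12, b = 15, r = 5, k = 4, $N_1 = N_2 = 2$, $N_3 = 3$, $\lambda_1 = 0$, $\lambda_2 = 3$, $\lambda_3 = 1$ listed above; the function names `gd_parameters` and `violated` are hypothetical.

```python
from math import prod

def gd_parameters(N, r, k, lam):
    """Return (n, P): the n_j of (2.1) and the P_i of (4.4).
    N[h-1] = N_h and lam[j-1] = lambda_j."""
    m = len(N)
    n = [prod(N[m - j + 1:]) * (N[m - j] - 1) for j in range(1, m + 1)]
    if sum(a * b for a, b in zip(n, lam)) != r * (k - 1):
        raise ValueError("parameters violate sum n_j lambda_j = r(k - 1)")
    P = [(r - lam[m - i]) + sum((lam[j] - lam[m - i]) * n[j]
                                for j in range(m - i))
         for i in range(1, m + 1)]
    return n, P

def violated(N, r, k, lam):
    """Indices i with P_i < 0; a non-empty list rules the design out."""
    _, P = gd_parameters(N, r, k, lam)
    return [i + 1 for i, p in enumerate(P) if p < 0]

n, P = gd_parameters((2, 2, 3), 5, 4, (0, 3, 1))
print(n, P, violated((2, 2, 3), 5, 4, (0, 3, 1)))
```

For this design the sketch gives n = [2, 3, 6] and P = [8, -4, 5], so the condition of Theorem 4.1 fails for $P_2$.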
We can classify the existing designs mainly into (m + 1) classes as follows: 
(1) Singular GD m-associate designs characterized by $P_m = 0$; 
(2) $P_m$-regular GD m-associate designs characterized by $P_m > 0$, $P_{m-1} = 0$; 

...

(i) $P_m, P_{m-1}, \dots, P_{m-i+2}$-regular GD m-associate designs characterized 
by $P_m > 0$, $P_{m-1} > 0$, ..., $P_{m-i+2} > 0$, $P_{m-i+1} = 0$; 

...

(m) $P_m, P_{m-1}, \dots, P_2$-regular GD m-associate designs characterized 
by $P_m > 0$, $P_{m-1} > 0$, ..., $P_2 > 0$, $P_1 = 0$; and 

(m + 1) Regular GD m-associate designs characterized by $P_i > 0$ 
(i = 1, 2, ..., m). 

Excepting the last two classes, the other classes can be further divided; but, 
since this will be cumbersome, we do not do so. 


5. Some Combinatorial Properties of Group Divisible m-associate Designs. 
If $P_i = 0 = P_{i+1}$ (i = 1, 2, ..., m − 1), we have $\lambda_{m-i+1} = \lambda_{m-i}$. Thus if 
$P_1 = 0 = P_2 = \cdots = P_m$, then $r = \lambda_1 = \cdots = \lambda_m$, and the GD m-associate 
design reduces to an ordinary randomised block design. Hence, we have 

THEOREM 5.1. If, in a Group Divisible m-associate design, $P_1 = 0 = P_2 = 
\cdots = P_m$, then the design reduces to a randomized block design. 

Let j consecutive λ’s (j = 2, 3, ..., m − 1) of the GD m-associate design 
be equal. In this case we can see from the association scheme that the design 
reduces to a GD (m − j + 1)-associate design. The above result can be written 
in the form of the following theorem. 

THEOREM 5.2. If, in a Group Divisible m-associate design, j consecutive λ’s 
(j = 2, 3, ..., m − 1) are equal, then the design reduces to a Group Divisible 
(m − j + 1)-associate design. 

We now prove another important theorem. 

THEOREM 5.3. In a $P_m, P_{m-1}, \dots, P_2$-regular Group Divisible m-associate 
design, k is divisible by $N_1$. Further, every block contains $k/N_1$ treatments of the 
form $v_{ii_2 \cdots i_m}$ ($i_2 = 1, 2, \dots, N_2$; $i_3 = 1, 2, \dots, N_3$; ...; $i_m = 1, 2, \dots, N_m$) 
for any i (i = 1, 2, ..., $N_1$). 

PROOF. For any i (i = 1, 2, ..., $N_1$), let $e^i_j$ treatments of the form $v_{ii_2 \cdots i_m}$ 
($i_2 = 1, 2, \dots, N_2$; $i_3 = 1, 2, \dots, N_3$; ...; $i_m = 1, 2, \dots, N_m$) occur in 
the jth block (j = 1, 2, ..., b). Then, we have 

(5.1)
$$\sum_{j=1}^{b} e^i_j = N_2N_3 \cdots N_m r,$$

$$\sum_{j=1}^{b} e^i_j(e^i_j - 1) = N_2N_3 \cdots N_m(n_1\lambda_1 + n_2\lambda_2 + \cdots + n_{m-1}\lambda_{m-1}),$$


since each of the treatments occurs in r blocks and every pair of treatments of the 
form $v_{ii_2 \cdots i_m}$ ($i_2 = 1, 2, \dots, N_2$; $i_3 = 1, 2, \dots, N_3$; ...; $i_m = 1, 2, \dots, N_m$) 
occurs in $\lambda_1, \lambda_2, \dots$, or $\lambda_{m-1}$ blocks. Using the property of the $P_m, P_{m-1}, \dots, 
P_2$-regular GD m-associate design and (5.1), we get 


(5.2)  $\sum_{j=1}^{b} (e^i_j)^2 = N_2^2N_3^2 \cdots N_m^2\lambda_m.$


Let $\bar e^i = b^{-1}\sum_{j=1}^{b} e^i_j = k/N_1$. Then, 

(5.3)  $\sum_{j=1}^{b} (e^i_j - \bar e^i)^2 = N_2^2N_3^2 \cdots N_m^2\lambda_m - bk^2/N_1^2 = 0.$


Therefore, $e^i_1 = e^i_2 = \cdots = e^i_b = \bar e^i = k/N_1$. Since $e^i_j$ (i = 1, 2, ..., $N_1$; 
j = 1, 2, ..., b) must be integral, k is divisible by $N_1$. Further, $e^i_j = k/N_1$ 
(i = 1, 2, ..., $N_1$; j = 1, 2, ..., b). This completes the proof of the theorem. 

The following $P_3$, $P_2$-regular GD 3-associate designs have a non-integral 
value for $k/N_1$ and hence are non-existing: 

1. v = 12, b = 16, r = 4, k = 3, $N_1 = 2$, $N_2 = 3$, $N_3 = 2$, 
$\lambda_1 = 2$, $\lambda_2 = 0$, $\lambda_3 = 1$. 

2. v = 12, b = 16, r = 4, k = 3, $N_1 = 2 = N_2$, $N_3 = 3$, 
$\lambda_1 = 1$, $\lambda_2 = 0$, $\lambda_3 = 1$. 

3. v = 12, b = 9, r = 3, k = 4, $N_1 = 3$, $N_2 = N_3 = 2$, 
$\lambda_1 = 1$, $\lambda_2 = 0$, $\lambda_3 = 1$. 

4. v = 20, b = 32, r = 8, k = 5, $N_1 = 2$, $N_2 = 5$, $N_3 = 2$, 
$\lambda_1 = 4$, $\lambda_2 = 1$, $\lambda_3 = 2$. 
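
The quantities appearing in the proof of Theorem 5.3 can be evaluated directly from the parameters. The sketch below, with the hypothetical function name `block_counts_check`, computes $P_1$, the mean $k/N_1$ of the $e^i_j$, and the sum of squared deviations in (5.3) for the first design of the list (v = 12, b = 16, r = 4, k = 3, $N_1 = 2$, $N_2 = 3$, $N_3 = 2$, $\lambda_1 = 2$, $\lambda_2 = 0$, $\lambda_3 = 1$).

```python
from fractions import Fraction
from math import prod

def block_counts_check(N, b, r, k, lam):
    """Return (P_1, mean of e_j, sum of squared deviations of e_j)
    as they appear in (5.1)-(5.3). N[h-1] = N_h, lam[j-1] = lambda_j."""
    m = len(N)
    n = [prod(N[m - j + 1:]) * (N[m - j] - 1) for j in range(1, m + 1)]
    P1 = (r - lam[-1]) + sum((lam[j] - lam[-1]) * n[j] for j in range(m - 1))
    size = prod(N[1:])                         # N_2 N_3 ... N_m treatments per set
    sum_e = size * r                           # first line of (5.1)
    sum_e2 = sum_e + size * sum(n[j] * lam[j] for j in range(m - 1))
    mean = Fraction(sum_e, b)
    assert mean == Fraction(k, N[0])           # equals k / N_1 since vr = bk
    spread = sum_e2 - b * mean ** 2            # left-hand side of (5.3)
    return P1, mean, spread

P1, mean, spread = block_counts_check((2, 3, 2), 16, 4, 3, (2, 0, 1))
print(P1, mean, spread)
```

For this design $P_1 = 0$ and the spread is zero, so every block would have to contain exactly 3/2 treatments from each of the two sets, which is impossible.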

A GD mz-associate design is said to be symmetrical if b = v and in conse- 
quence r = k. Shrikhande [9] and Chowla and Ryser [5] have obtained conditions 
necessary for the existence of symmetrical balanced incomplete block designs. 
Bose and Connor have obtained necessary conditions for the existence of sym- 
metrical regular GD designs. We shall extend their results to symmetrical 
regular GD m-associate designs. With this in view, we give a brief resume of 
the important properties of the Legendre symbol, the Hilbert norm residue 
symbol and the Hasse-Minkowski invariant. 


6. Some known results about the Legendre symbol, the Hilbert norm residue 
symbol and the Hasse-Minkowski invariant. The Legendre symbol is defined as 

(6.1)  $(a/p) = \begin{cases} +1, & \text{if } a \text{ is a quadratic residue of } p; \\ -1, & \text{if } a \text{ is a quadratic non-residue of } p. \end{cases}$


A slight generalization of the Legendre symbol is the Hilbert norm residue 
symbol $(a, b)_p$. If a and b are any non-zero rational numbers, we define $(a, b)_p$ to 
have the value +1 or −1 according as the congruence 


(6.2)  $ax^2 + by^2 \equiv 1 \pmod{p^r}$


has or has not, for every value of r, rational solutions $x_r$ and $y_r$. Here p is any 
prime including the conventional prime $p_\infty = \infty$. 


Many properties of $(a, b)_p$ are given by Bruck and Ryser [3], Jones [6] and 
Pall [7]. For further use, we reproduce the properties of $(a, b)_p$ taken from the 
above references, in the form of the following theorems. 


THEOREM 6.1. If m and m′ are integers not divisible by the odd prime p, then 
(6.3)  $(m, m')_p = +1$, 
(6.4)  $(m, p)_p = (m/p)$. 
Moreover, if $m \equiv m' \not\equiv 0 \pmod p$, then 
(6.5)  $(m, p)_p = (m', p)_p$. 
THEOREM 6.2. For arbitrary non-zero integers m, m′, n, n′, and for every prime p, 
(6.6)  $(-m, m)_p = +1$, 
(6.7)  $(m, n)_p = (n, m)_p$, 
(6.8)  $(mm', n)_p = (m, n)_p(m', n)_p$, 
(6.9)  $(m, nn')_p = (m, n)_p(m, n')_p$, 


(6.10)  $(mm', m - m')_p = (m, -m')_p$, 


(6.11)  $\prod_{j=1}^{m} (j, j + 1)_p = ((m + 1)!, -1)_p$, 


and 
(6.12)  $(am^2, b)_p = (a, b)_p$. 

Now, let A = (a_ij) be any n × n symmetric matrix with rational elements.
The matrix B is said to be rationally congruent to A, written A ~ B, provided
there exists a non-singular matrix C with rational elements, such that A = CBC',
where C' is the transpose of C. If D_i (i = 1, 2, ..., n) denotes the leading
principal minor determinant of order i in the matrix A, then if none of the D_i
vanishes, the quantity


(6.13)   C_p = C_p(A) = (-1, -D_n)_p ∏_{i=1}^{n-1} (D_i, -D_{i+1})_p


is invariant for all matrices rationally congruent to A. C_p(A) defined above is
known as the Hasse-Minkowski invariant.
The following lemmas regarding C_p will be useful.


LEMMA 6.1. If d is a rational number and A_m = d I_m, where I_m is the identity
matrix of order m, then


(6.14)   C_p(A_m) = (-1, -1)_p (d, -1)_p^{m(m+1)/2}.


LEMMA 6.2. If A and B are symmetric matrices with rational elements and
U = A ⊕ B is the direct sum of A and B, then


(6.15)   C_p(U) = (-1, -1)_p C_p(A) C_p(B) (|A|, |B|)_p.
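
As a hedged numerical companion to (6.13) and Lemma 6.1, the sketch below evaluates C_p(A) for a rational symmetric matrix and an odd prime p. It assumes, as the definition does, that no leading principal minor vanishes; the minors are obtained by Gaussian elimination without row exchanges over exact fractions. All function names are hypothetical, and the Hilbert symbol is extended to rationals by noting that x = s/t lies in the same square class as the integer st.

```python
from fractions import Fraction


def legendre(a, p):
    """Legendre symbol (a/p) for an odd prime p; 0 if p divides a."""
    a %= p
    if a == 0:
        return 0
    return 1 if pow(a, (p - 1) // 2, p) == 1 else -1


def hilbert(a, b, p):
    """(a, b)_p for non-zero rationals a, b and an odd prime p."""
    def power_and_unit(x):
        x = Fraction(x)
        n = x.numerator * x.denominator   # same square class as x
        alpha = 0
        while n % p == 0:
            n //= p
            alpha += 1
        return alpha, n

    alpha, u = power_and_unit(a)
    beta, v = power_and_unit(b)
    sign = -1 if (alpha * beta * (p - 1) // 2) % 2 else 1
    return sign * legendre(u, p) ** beta * legendre(v, p) ** alpha


def leading_minors(A):
    """Leading principal minors D_1, ..., D_n (all assumed non-zero)."""
    n = len(A)
    M = [[Fraction(x) for x in row] for row in A]
    minors, det = [], Fraction(1)
    for i in range(n):
        if M[i][i] == 0:
            raise ValueError("a leading principal minor vanishes")
        det *= M[i][i]                    # D_i is the product of the first i pivots
        minors.append(det)
        for r in range(i + 1, n):
            f = M[r][i] / M[i][i]
            for c in range(i, n):
                M[r][c] -= f * M[i][c]
    return minors


def hasse_minkowski(A, p):
    """C_p(A) as in (6.13)."""
    D = leading_minors(A)
    c = hilbert(-1, -D[-1], p)
    for i in range(len(D) - 1):
        c *= hilbert(D[i], -D[i + 1], p)
    return c


# Lemma 6.1 for d = 3, m = 2, p = 3: both sides are evaluated independently.
A = [[3, 0], [0, 3]]
print(hasse_minkowski(A, 3), hilbert(-1, -1, 3) * hilbert(3, -1, 3) ** 3)
```

The same routine can be used to check Lemma 6.2 on small diagonal matrices by comparing C_p of a direct sum with the right-hand side of (6.15).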







7. Necessary conditions for the existence of Symmetrical Regular Group
Divisible m-associate Designs. Since the design is a symmetric one, det(NN')
is a perfect square (cf. Connor and Clatworthy, and Shrikhande). Thus

P¥i7 pyoa—y mpi PENN m1 Km) 


is a perfect square. This result can be written in the form of the following theo- 
rem. 

THEOREM 7.1. A necessary condition for the existence of a regular symmetrical 
Group Divisible m-associate design is that P{** Py?” --. PRE Na-1and 
is a perfect square. 

The designs with the following parameters do not satisfy the above theorem 
and hence are impossible. 

lv=24=)Dr 6 = k,N, = 4,N2 = 2,N; = 

A = 3, 2 = 2, As 

2.0=32=b,r 7 =k, N, = 4,N: = 2,N; 

A = 2, Az = 3, As 
30 = br 7 = k, N, = 5, N2 = 3, Ns 
A = 2, 2 = 4, Xs 
30 = = k, Ni = 3, N: = 5,N; = 
Mi = 4,2 = 6,A3 = ° 
= k, N, = 3 = N2 = N;, Nz = 2, 
Ai = 6, Ae = 5,3 = 4, Xs, = is 


’ 


Qi = By — Aj, 


where B,’s and A,’s are as defined in Section 4. Det(Q,)i = 2, 3, --- 
be found easily, and we have 


Qi) = ((r — Aa) + An — Aa) + ee & Ona — Ay) ni!) 


f(r — Ava) + (Aa — Aca) Hee Oe — Aca) eg 


Na~is24 m— Neu Nenl 
fr — ds} eters 48 m—1 Nm) 


Now, let us calculate the Hasse-Minkowski invariant of NN' for odd primes
using the method of Bose and Connor. Taking the direct sum with -λ_m, NN'
becomes


(7.3)    (NN')_1 = [ NN'     0   ]
                   [  0    -λ_m  ].


Therefore, from Lemma 6.2,


(7.4)    C_p(NN')_1 = C_p(NN') (λ_m, -1)_p.
Qn 
(NN’),; ~ 


® iL 
L L we L — mn 


where L is an N,N; --- Nth order column vector with —\,, everywhere. Hence 

(7.6) Cy(NN’): = {Co(Qu)}*"(| Qm |, — 1) ?(Am, — | Qu |*")e- 

Equating (7.4) and (7.6), we get 

(7.7) Cy(NN’) = {Cp(Qm)}""(| Qu], — 1) 9? (Xm, | Qn D5": 

C,(Q;)t = 2,3, «++ , m can be calculated in a similar way as above and we get 

Cy(Qz) = (r — ar, — 1) 39% P(X, — ae a)” 

(7.8) (| Qe |, 7 — Ar)57(| Qe |, Aa — Az) p, 
C(Qi) = (Cp(Qi-1)}*"-**9(| Qi |, — 1) Zen tet n-s49 te 

(7.9) (Nea — Ac, | Qe |)p(Aca — Ac, | Qua |)” -*** 


(| Qs |, | Qe-a |)%9>**?, i = 3,4,---,m. 


Equation (7.9) is a recurrence relation. This equation, with the help of (7.2)
and (7.8), finally gives C_p(Q_m). Substituting this value of C_p(Q_m) in (7.7),
C_p(NN') can be calculated. Now, since I_v = N^{-1}(NN')(N')^{-1}, I_v ~ NN'.
Therefore,


(7.10)   C_p(NN') = C_p(I_v) = (-1, -1)_p = +1.


Thus we have the following theorem.

THEOREM 7.2. A necessary condition for the existence of a symmetrical regular
Group Divisible m-associate design is that C_p(NN') = +1 for odd primes p, where
C_p(NN') is calculated from (7.2), (7.8), (7.9) and (7.7).

When there are only three associate classes the above calculations can be 
simplified and the corollary follows: 

COROLLARY 7.2.1. A necessary condition for the existence of a regular
symmetrical Group Divisible 3-associate design is that


(NyNoN3(Ny+Nq4Nq4+3)—N 1 Nq(N1+NQ))/2 
ifae = 35 1 273 your 2 (As, Pi)» 


te sas . (Pi, oa is Perr, . = y pM al¥aCis +h) +ls—0) rn +2) 8 
‘. . 7 oo 
* (Xo — As, P,P2)3'(u — Ds, P,P,)5*" (P, ,Pd5*”* 


Pp 


- (Py, P2)3'"*(P,, P32” = +1, for all odd primes p.

ILLUSTRATION 7.2.1. Consider the GD 3-associate design with the parameters
v = 27 = b, r = 7 = k, N_1 = 3 = N_2 = N_3, λ_1 = 6, λ_2 = 2, λ_3 = 1.
The left hand side of (7.11) is

    (22, 13)_p = (13, 2)_p (13, 11)_p = -1, when p = 11.


Thus Corollary 7.2.1 is not satisfied and the design is impossible.
ILLUSTRATION 7.2.2. For the GD 3-associate design with the parameters 


v= 48 = b,r = 10 = k, Ni = 6, Nz = 4, N; = 2, 
Ai = 4, Ae 
the left hand side of (7.11) is 


    (12, -1)_p = (3, -1)_p = -1, for p = 3.


Corollary 7.2.1 is not satisfied and the design is impossible.
By applying Corollary 7.2.1, it can easily be verified that the following
designs do not exist:
ay v= 24=br=9=k,Ni = 2=N:,N,;=6,\.=6, = 1,A3 = 
20=24= br= 10 = k, Ni = 2 => N:,N; = 6, Ay = 6, A: = 2, A; 
3. vu = 24 b,r = 10 = k, N; = 6,N2 = 2 = Nz, = 6, A2 = 2,A; 
s 


. = 40 = b, r= 13 = k, N, = 10, N2=2=<=WN;, ys = 10, re 
A; = 4. 


8. Acknowledgment. My sincere thanks are due to Professor M. C. Chakrabarti
for his kind guidance.


APPENDIX


Here we give some numerical constructions in the useful range r, k < 10.
For convenience we denote the treatment v_{ijk} by (ijk) in the following examples.

1. v = 8 = b, r = 3 = k, N_1 = 2 = N_2 = N_3, λ_1 = 2, λ_2 = 0, λ_3 = 1.
Taking the treatments as


(111) (112) (211) (212)
(121) (122) (221) (222)


the plan of the design is


       [(111) (112) (211)]
       [(112) (111) (212)]
       [(121) (122) (221)]
       [(122) (121) (222)]
       [(211) (212) (121)]
       [(212) (211) (122)]
       [(221) (222) (111)]
       [(222) (221) (112)]
Reps.    I     II    III
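
A plan of this kind can be checked mechanically. The Python sketch below recomputes the replication number and the concurrence numbers of Example 1 from its eight blocks. It assumes the association rule suggested by the labels (ijk): two treatments are taken to be first associates when only k differs, second associates when i agrees and j differs, and third associates when i differs. That rule is stated here as an assumption, since the defining section is not reproduced in this appendix, and the helper name `association_class` is hypothetical.

```python
from collections import Counter
from itertools import combinations

# Blocks of Example 1, one tuple per block.
blocks = [
    ("111", "112", "211"), ("112", "111", "212"),
    ("121", "122", "221"), ("122", "121", "222"),
    ("211", "212", "121"), ("212", "211", "122"),
    ("221", "222", "111"), ("222", "221", "112"),
]


def association_class(s, t):
    """Assumed rule: 1 if only k differs, 2 if i agrees and j differs, 3 if i differs."""
    if s[0] != t[0]:
        return 3
    if s[1] != t[1]:
        return 2
    return 1


treatments = sorted({t for blk in blocks for t in blk})
replications = Counter(t for blk in blocks for t in blk)
together = Counter(frozenset(pair) for blk in blocks for pair in combinations(blk, 2))

# For each association class, collect the distinct concurrence numbers observed.
lambdas = {}
for s, t in combinations(treatments, 2):
    lambdas.setdefault(association_class(s, t), set()).add(together[frozenset((s, t))])

print(sorted(set(replications.values())))   # replication numbers that occur
print({u: sorted(vals) for u, vals in sorted(lambdas.items())})
```

If each class yields a single concurrence number, the plan is consistent with the stated λ_1, λ_2, λ_3 under the assumed association rule.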


2. v = 8 = b, r = 4 = k, N_1 = 2 = N_2 = N_3, λ_1 = 2, λ_2 = 1, λ_3 = 2.
Taking the treatments as in the above example, the plan of the design is


       [(111) (112) (211) (221)]
       [(112) (111) (212) (222)]
       [(121) (122) (221) (212)]
       [(122) (121) (222) (211)]
       [(222) (221) (111) (121)]
       [(221) (222) (112) (122)]
       [(211) (212) (122) (111)]
       [(212) (211) (121) (112)]
Reps.    I     II    III   IV


3. v = 8, b = 24, r = 9, k = 3, N_1 = 2 = N_2 = N_3, λ_1 = 4, λ_2 = 1, λ_3 = 3.
Taking the treatments as in Example 1, the plan of the design is


Reps. I, II, III       Reps. IV, V, VI        Reps. VII, VIII, IX

[(111) (112) (211)]    [(111) (112) (221)]    [(111) (121) (211)]
[(112) (111) (212)]    [(112) (111) (222)]    [(122) (111) (221)]
[(121) (122) (221)]    [(121) (122) (211)]    [(121) (112) (222)]
[(122) (121) (222)]    [(122) (121) (212)]    [(112) (122) (212)]
[(211) (212) (121)]    [(211) (212) (111)]    [(211) (221) (112)]
[(212) (211) (122)]    [(212) (211) (112)]    [(222) (211) (122)]
[(221) (222) (111)]    [(221) (222) (121)]    [(221) (212) (121)]
[(222) (221) (112)]    [(222) (221) (122)]    [(212) (222) (111)]


4. v = 8 = b, r = 5 = k, N_1 = 2 = N_2 = N_3, λ_1 = 4, λ_2 = 2, λ_3 = 3.
Taking the treatments as in Example 1, the plan of the design is


       [(111) (112) (211) (221) (222)]
       [(112) (111) (212) (222) (221)]
       [(121) (122) (222) (211) (212)]
       [(122) (121) (221) (212) (211)]
       [(211) (212) (121) (111) (112)]
       [(212) (211) (122) (112) (111)]
       [(221) (222) (111) (121) (122)]
       [(222) (221) (112) (122) (121)]
Reps.    I     II    III   IV    V


5. v = 8 = b, r = 6 = k, N_1 = 2 = N_2 = N_3, λ_1 = 4, λ_2 = 5, λ_3 = 4.
Taking the treatments as in Example 1, the plan of the design is


       [(111) (112) (121) (122) (211) (221)]
       [(112) (121) (122) (111) (222) (211)]
       [(121) (122) (111) (112) (221) (212)]
       [(122) (111) (112) (121) (212) (222)]
       [(211) (212) (221) (222) (111) (121)]
       [(212) (221) (222) (211) (122) (111)]
       [(221) (222) (211) (212) (112) (122)]
       [(222) (211) (212) (221) (121) (112)]
Reps.    I     II    III   IV    V     VI


6. v = 12 = b, r = 4 = k, N_1 = 2 = N_2, N_3 = 3, λ_1 = 3, λ_2 = 0, λ_3 = 1.
Taking the treatments as


(111) (112) (113) (211) (212) (213)
(121) (122) (123) (221) (222) (223)

the plan of the design is

       [(111) (112) (113) (211)]
       [(112) (113) (111) (212)]
       [(113) (111) (112) (213)]
       [(121) (122) (123) (221)]
       [(122) (123) (121) (222)]
       [(123) (121) (122) (223)]
       [(211) (212) (213) (121)]
       [(212) (213) (211) (122)]
       [(213) (211) (212) (123)]
       [(221) (222) (223) (111)]
       [(222) (223) (221) (112)]
       [(223) (221) (222) (113)]
Reps.    I     II    III   IV


7. v = 16 = b, r = 4 = k, N_1 = 2 = N_2 = N_3 = N_4, λ_1 = 0, λ_2 = 2,
λ_3 = 0, λ_4 = 1. Taking the treatments as


(1111) (1112) (1211) (1212)
(1121) (1122) (1221) (1222)
(2111) (2112) (2211) (2212)
(2121) (2122) (2221) (2222)

the plan of the design is

       [(1111) (1121) (2111) (2121)]
       [(1121) (1111) (2112) (2122)]
       [(2211) (2221) (1111) (1122)]
       [(2212) (2222) (1122) (1111)]
       [(2221) (2211) (1112) (1121)]
       [(2222) (2212) (1121) (1112)]
       [(1112) (1122) (2121) (2111)]
       [(1122) (1112) (2122) (2112)]
       [(2111) (2122) (1211) (1222)]
       [(2112) (2121) (1222) (1211)]
       [(1211) (1221) (2211) (2222)]
       [(1221) (1211) (2212) (2221)]
       [(1222) (1212) (2222) (2211)]
       [(1212) (1222) (2221) (2212)]
       [(2122) (2111) (1221) (1212)]
       [(2121) (2112) (1212) (1221)]
Reps.     I      II     III    IV


8. v = 16, b = 32, r = 8, k = 4, N_1 = 2 = N_2 = N_3 = N_4, λ_1 = 4,
λ_2 = 2, λ_3 = 0, λ_4 = 2. Taking the treatments as in the above example, the
plan of the design is


Reps. 
(1111) (1112) (2111) (2121)] 
(1112) (1111) (2112) (2122)] 
((2111) | (2122) | psa (1122)] 
{(2112) (2121) | (1122 (1121)] 
((1211) | (1212) | ee | (2221)) 
{(1212) (1211) (2212) | (2222)} 
((2211) | (2222) | (1221) | (1222)) 
[((2212) | (2221) | (1222) gee (1221 )} 


ence 


((2221) | (2211) | (1111) 
((2222) | (2212) | (4112) 
(1121) | (1122) (2222) (2211)) 
(1122) | (4121) (2221) (2212)} 
((2121) | (2111) | (1211) (1212)) 
((2122) | (2112) (1212) (1211)] 
((1221) 222) (2122) (2111)] 
((1222) | | (2121) (2112)) 


(1112)) 
(1111)] 


(2111) | (2112) | (1111) (1121)] 
(2112) | (2111) | (1112) (1122)) 
(4111) =| (1122) | (2121) (2122)) 
(1112) | (1121) (2122) (2121)} 
((2211) | (2212) | (1211) (1221 )} 
{(2212) (2211) | (1212) (1222)} 
(1211) (1222) (2221) (2222)) 
((1212) | (1221) | (2222) | (2221)] 
(1121) | s((4001) s| (2211) (2212)) 
(1122) (1112) (2212) (2211)] 
{(2221) (2222) (1122) | (41111)j 
{(2222) (2221) (1121) | (1112)) 
{(1221) (1211) (2111) | (2112)) 
{( 1222) (1212) (2112) | (2111)) 
{(2121) (2122) (1222) | (1211)} 
{(2122) (2121) (1221) | (1212)) 

Reps. I, Il Ill, IV V, VI VII, Vill 





Vil, Vill 


7, k ed 3, N, = 3, N, = 3, N; = 6
λ_3 = 1. Taking the treatments as


(111) (112) (211) (212) (311) (312) 

(121) (122) (221) (222) (321) (322) 

(131) (132) (231) (232) (331) (332) 
the plan of the design is 


[(111), (112), (211)]    [(111), (112), (212)]
[(121), (122), (221)]    [(121), (122), (222)]
[(131), (132), (231)]    [(131), (132), (232)]
[(211), (212), (311)]    [(211), (212), (312)]
[(221), (222), (321)]    [(221), (222), (322)]
[(231), (232), (331)]    [(231), (232), (332)]
[(311), (312), (111)]    [(311), (312), (112)]
[(321), (322), (121)]    [(321), (322), (122)]
[(331), (332), (131)]    [(331), (332), (132)]
[(111), (221), (331)]    [(111), (222), (332)]
[(111), (231), (321)]    [(111), (232), (322)]
[(112), (221), (332)]    [(112), (222), (331)]
[(112), (231), (322)]    [(112), (232), (321)]
[(121), (211), (331)]    [(121), (212), (332)]
[(121), (231), (311)]    [(121), (232), (312)]
[(122), (211), (332)]    [(122), (232), (311)]
[(122), (231), (312)]    [(122), (212), (331)]
[(131), (211), (321)]    [(131), (212), (322)]
[(131), (221), (311)]    [(131), (222), (312)]
[(132), (211), (322)]    [(132), (212), (321)]
[(132), (221), (312)]    [(132), (222), (311)]


10. v = 24, b = 16, r = 4, k = 6, N_1 = 3, N_2 = 4, N_3 = 2, λ_1 = 4, λ_2 = 0,
λ_3 = 1. Taking the treatments as


(111) (112) (211) (212) (311) (312) 

(121) (122) (221) (222) (321) (322) 

(131) (132) (231) (232) (331) (332) 

(141) (142) (241) (242) (341) (342) 
the plan of the design is 


[(111), (112), (211), (212), (311), (312)]
[(111), (112), (221), (222), (321), (322)]
[(111), (112), (231), (232), (331), (332)]
[(111), (112), (241), (242), (341), (342)]
[(121), (122), (211), (212), (321), (322)]
[(121), (122), (221), (222), (331), (332)]
[(121), (122), (231), (232), (341), (342)]
[(121), (122), (241), (242), (311), (312)]
[(131), (132), (211), (212), (331), (332)]
[(131), (132), (221), (222), (341), (342)]
[(131), (132), (231), (232), (311), (312)]
[(131), (132), (241), (242), (321), (322)]
[(141), (142), (211), (212), (341), (342)]
[(141), (142), (221), (222), (311), (312)]
[(141), (142), (231), (232), (321), (322)]
[(141), (142), (241), (242), (331), (332)]





RELATIONS AMONG THE BLOCKS OF THE KRONECKER 
PRODUCT OF DESIGNS 


By Manohar Narhar Vartak
University of Bombay 


1. Summary and Introduction. In the case of some incomplete block designs,
interesting relations among their blocks have been discovered. For example,
Fisher [1] has shown that in the case of a symmetrical BIB (Balanced Incomplete
Block) design with parameters v = b, r = k, λ, any two blocks have exactly λ
treatments in common. Similarly, Bose [2] has shown that in the case of an
affine resolvable BIB design with parameters


v = nk = n^2{(n - 1)t + 1},   b = nr = n(n^2 t + n + 1),   λ = nt + 1,


the blocks can be divided into sets of n blocks, such that each set is a complete
replication and any two blocks have k^2/v = (nt - t + 1) or 0 treatments
in common according as they belong to different sets or the same set. Also
see Connor [3] and Bose and Connor [4] for similar results.

Confining our attention to PBIB (Partially Balanced Incomplete Block) 
designs with two or three associate classes, we wish to see how this type of in- 
formation for blocks of BIB designs can be used to obtain similar information 
for the blocks of their Kronecker product. 

In the next section are given a few general properties of the Kronecker product 
of designs. In Section 3 the main theorems of the paper are proved and their 
important particular cases are discussed. Some observations on the interconnec- 
tion between these results and the theorems on inversion of designs (cf. Roy 
[5], Shrikhande [6]) are made in Section 4. 


2. Some general properties of the Kronecker product of designs. We shall
always denote the Kronecker product of matrices A and B by A × B (cf.
Vartak [7]); and the ordinary product of A and B, whenever it exists, will be
denoted by A·B or AB. The Kronecker product of designs was defined in [7]
as the design whose incidence matrix is the Kronecker product of the incidence
matrices of the given designs.

We shall consider throughout this section two designs N_1 and N_2, with v_1 and
v_2 treatments and b_1 and b_2 blocks respectively. The design N_1', whose
incidence matrix N_1' is the transpose of N_1, is said to be the design obtained
from N_1 by inversion [5], or dualization [6]. Similarly for the design N_2.
Since the Kronecker product of matrices satisfies the law


(2.1)    (A × B)' = A' × B'


we get the following result for the inversion of the Kronecker product of designs.
THEOREM 2.1. The design obtained by the inversion of the Kronecker product of
two given designs is the same as the Kronecker product of the inversions of the given
designs. Thus if N_1 and N_2 are both symmetric (or self-dual), their Kronecker
product is also symmetric (or self-dual).

In many cases we are interested in the matrix NN' where N is the incidence
matrix of a given design. Let N = N_1 × N_2, where N_1 and N_2 are the given
designs. Clearly the Kronecker product of matrices satisfies the relation


(2.2)    (AB) × (CD) = (A × C)·(B × D)


where A, B, C and D are matrices of orders m × k, k × n, p × j and j × q
respectively. Both sides of (2.2) are then mp × nq matrices. Hence we get

THEOREM 2.2. The matrix NN' for the Kronecker product N = N_1 × N_2 of two
given designs is the Kronecker product of the corresponding matrices for the given
designs; similarly for the matrix N'N.

Finally, we need the following two results from [7].

2A. The Kronecker product N = N_1(BIB) × N_2(BIB) of two BIB designs
N_1(BIB) and N_2(BIB) defined by the respective sets of parameters


(2.3)    v_1, b_1, r_1, k_1, λ_1
and
(2.4)    v_2, b_2, r_2, k_2, λ_2


is a PBIB design with at most three associate classes.

The three associate classes of the design N defined above are all distinct if
r_1 λ_2 ≠ r_2 λ_1.

In any case, the parameters of the design N can be expressed in terms of those
of the BIB designs given by (2.3) and (2.4) by the following equations


         v' = v_1 v_2,     b' = b_1 b_2,     r' = r_1 r_2,     k' = k_1 k_2,

         n_1' = v_2 - 1,   n_2' = v_1 - 1,   n_3' = n_1' n_2',

         λ_1' = r_1 λ_2,   λ_2' = r_2 λ_1,   λ_3' = λ_1 λ_2,

(2.5)
                      [ v_2 - 2      0              0             ]
         (p^1_{yz}) = [    0         0           v_1 - 1          ]
                      [    0      v_1 - 1   (v_1 - 1)(v_2 - 2)    ]

                      [    0         0           v_2 - 1          ]
         (p^2_{yz}) = [    0      v_1 - 2           0             ]
                      [ v_2 - 1      0      (v_1 - 2)(v_2 - 1)    ]

                      [    0         1           v_2 - 2          ]
         (p^3_{yz}) = [    1         0           v_1 - 2          ]
                      [ v_2 - 2   v_1 - 2   (v_1 - 2)(v_2 - 2)    ]


where y, z = 1, 2, 3.
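
The concurrence parameters in 2A can be confirmed on a small case. The Python sketch below is an illustration under stated assumptions, not a construction from [7]: it forms the Kronecker product of the (7, 7, 3, 3, 1) design developed from the difference set {0, 1, 3} modulo 7 with the (3, 3, 2, 2, 1) design consisting of all pairs from three treatments, and then reads the concurrences off NN'. The labelling of treatment (a, c) by the row index a·v_2 + c, and the helper names `incidence`, `kron`, `gram` and `assoc`, are choices made only for this example.

```python
def incidence(v, blocks):
    """v x b incidence matrix: entry 1 when treatment t lies in the block."""
    return [[1 if t in blk else 0 for blk in blocks] for t in range(v)]


def kron(A, B):
    """Kronecker product A x B of two matrices given as lists of rows."""
    return [[a * b for a in ra for b in rb] for ra in A for rb in B]


def gram(N):
    """NN': entry (s, t) counts the blocks containing both treatments s and t."""
    return [[sum(x * y for x, y in zip(rs, rt)) for rt in N] for rs in N]


# N1: cyclic (7, 7, 3, 3, 1) design; N2: all pairs of three treatments.
v1, r1, lam1 = 7, 3, 1
N1 = incidence(v1, [{i % 7, (i + 1) % 7, (i + 3) % 7} for i in range(7)])
v2, r2, lam2 = 3, 2, 1
N2 = incidence(v2, [{0, 1}, {0, 2}, {1, 2}])

N = kron(N1, N2)            # treatment (a, c) is row a * v2 + c
G = gram(N)


def assoc(a, c, a2, c2):
    """0 for the same treatment; 1 if a agrees; 2 if c agrees; 3 otherwise."""
    if (a, c) == (a2, c2):
        return 0
    if a == a2:
        return 1
    if c == c2:
        return 2
    return 3


# Expected entries: r1*r2 on the diagonal, then r1*lam2, r2*lam1, lam1*lam2.
expected = {0: r1 * r2, 1: r1 * lam2, 2: r2 * lam1, 3: lam1 * lam2}
params_ok = all(
    G[a * v2 + c][a2 * v2 + c2] == expected[assoc(a, c, a2, c2)]
    for a in range(v1) for c in range(v2)
    for a2 in range(v1) for c2 in range(v2)
)
theorem_2_2_ok = G == kron(gram(N1), gram(N2))
print(params_ok, theorem_2_2_ok)
```

Here r_1 λ_2 = 3 and r_2 λ_1 = 2 differ, so this small product has three distinct associate classes, in line with the condition stated in 2A.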
As a direct consequence of 2A and Theorems 2.1 and 2.2, we get the following 
corollary 







COROLLARY 2.1.1. If a symmetrical PBIB design (i.e., one with v' = b' and hence
r' = k') with three associate classes and parameters (2.5) is the Kronecker product
of two symmetrical BIB designs, then with respect to any block B in it, the other
blocks fall into three groups (α), (β) and (γ) such that the group (α) contains
n_1' blocks each having λ_1' treatments in common with B, the group (β) contains
n_2' blocks each having λ_2' treatments in common with B, and the group (γ)
contains n_3' blocks each having λ_3' treatments in common with B.


PROOF. Let the given symmetrical PBIB design N be the Kronecker product
of the symmetrical BIB designs N_1(BIB) and N_2(BIB) with respective sets of
parameters


v_1 = b_1, r_1 = k_1, λ_1
and
v_2 = b_2, r_2 = k_2, λ_2.

By the well known result in [1], it follows that

N_1'(BIB)·N_1(BIB) = (r_1 - λ_1) I_{v_1} + λ_1 E_{v_1 v_1}


where I_{v_1} is the identity matrix of order v_1 and E_{v_1 v_1} is the matrix
of order v_1 × v_1 with all elements equal to 1. Similarly for the design
N_2(BIB). Since


N = N_1(BIB) × N_2(BIB),

it follows from Theorem 2.2 that

N'N = {(r_1 - λ_1) I_{v_1} + λ_1 E_{v_1 v_1}} × {(r_2 - λ_2) I_{v_2} + λ_2 E_{v_2 v_2}},

which, in virtue of (2.5), simplifies to

N'N = I_{v_1} × A + (E_{v_1 v_1} - I_{v_1}) × B,

that is, the partitioned matrix with A in every diagonal position and B in every
other position, where


A = (r' - λ_1') I_{v_2} + λ_1' E_{v_2 v_2}
and
B = (λ_2' - λ_3') I_{v_2} + λ_3' E_{v_2 v_2}.


The result of Corollary 2.1.1 follows from the fact that the element in the ith
row and the jth column of N'N equals the number of treatments common to the
ith and the jth blocks.

2B. A set of necessary and sufficient conditions for the Kronecker product
N of the two BIB designs given by (2.3) and (2.4) to have only two distinct
associate classes is given by


(2.6)    v_1 = v_2 = v say, and k_1 = k_2 = k say.

If these conditions are fulfilled, then it follows that

(2.7)    b_2/b_1 = r_2/r_1 = λ_2/λ_1 = μ, say,


where μ is a positive fraction; and in this case the parameters of N can be
expressed in terms of those of the BIB designs by the equations


         v' = v^2,   r' = μ r_1^2,   k' = k^2,

         n_1' = 2(v - 1),   n_2' = (v - 1)^2,

(2.8)    λ_1' = μ r_1 λ_1,   λ_2' = μ λ_1^2,

         (p^1_{yz}) = [ v - 2         v - 1       ]
                      [ v - 1    (v - 1)(v - 2)   ]

         (p^2_{yz}) = [    2        2(v - 2)  ]
                      [ 2(v - 2)   (v - 2)^2  ]

where y, z = 1, 2.


Both the results 2A and 2B are particular cases of a general result, Theorem
4.2 of [7].


3. The Main Theorems. Let N be the incidence matrix of a given design with
parameters v, b, r, k. Then the matrix


(3.1)    N'N = (n_ij);   i, j = 1, 2, ..., b;


is such that its general element n_ij gives the number of treatments common to
the ith and the jth blocks of N. If N_1 and N_2 are two designs with parameters
v_1, b_1, r_1, k_1 and v_2, b_2, r_2, k_2 respectively, then the matrix N'N for
the Kronecker product N = N_1 × N_2 is given by


(3.2)    N'N = (N_1'N_1) × (N_2'N_2).

From this we get the following theorem.

THEOREM 3.1. If in the design N_1 there exists a pair of blocks having m_1
treatments in common and in the design N_2 a pair of blocks having m_2 treatments
in common, then in their Kronecker product N = N_1 × N_2 there exists a pair of
blocks having m_1 m_2 treatments in common.

PROOF. It is clear that m_1 will be an element of N_1'N_1 and m_2 of N_2'N_2, so
that by (3.2) N'N will contain m_1 m_2 as an element. This proves Theorem 3.1.

Now consider a block B^{(1)} of the design N_1 and let b_i^{(1)} of the totality
of the blocks of N_1 have each i treatments in common with B^{(1)};
i = 0, 1, 2, ..., k_1. Clearly Σ_{i=0}^{k_1} b_i^{(1)} = b_1. Let B^{(2)} and
b_j^{(2)} have similar meanings for the design N_2, so that
Σ_{j=0}^{k_2} b_j^{(2)} = b_2. Remembering that the blocks of N_1 are of size k_1
and those of N_2 are of size k_2, we get the following theorem.

THEOREM 3.2. If there exist blocks B^{(1)} and B^{(2)} in the designs N_1 and N_2
respectively having the above properties, then there exists a block B in the
Kronecker product N = N_1 × N_2 such that b_u^{(0)} blocks of N have each u
treatments in common with B, where b_u^{(0)} is the coefficient of {u} in the
expression


(3.3)    ( Σ_{i=0}^{k_1} b_i^{(1)} {i} ) ( Σ_{j=0}^{k_2} b_j^{(2)} {j} ),

where the symbols {u} obey the ordinary laws of algebra, viz.,

         a{u} + b{u} = (a + b){u},
(3.4)    {u}{v} = {v}{u} = {uv},
         (a{u})(b{v}) = ab{uv}.


PROOF. From the conditions satisfied by the block B^{(1)} of N_1 we find that the
matrix N_1'N_1 contains a row of the form


(3.5)    ρ_1 = (0, 0, ..., 0, ..., i, i, ..., i, ..., k_1, k_1, ..., k_1),


where the integer i is repeated b_i^{(1)} times; i = 0, 1, ..., k_1. Similarly
from the properties of the block B^{(2)} of N_2 we find that the matrix N_2'N_2
contains a row of the form


(3.6)    ρ_2 = (0, 0, ..., 0, ..., j, j, ..., j, ..., k_2, k_2, ..., k_2),


where the integer j occurs b_j^{(2)} times; j = 0, 1, ..., k_2. The matrix N'N
for the Kronecker product N = N_1 × N_2 will clearly contain a row ρ = ρ_1 × ρ_2.

Now pick out the integer 0 in ρ. It arises b_2 times when each of the b_0^{(1)}
zeros in ρ_1 is the coefficient of ρ_2 in ρ, and also b_1 times when each of the
b_0^{(2)} zeros in ρ_2 multiplies the elements of ρ_1. But in this enumeration
of zeros, the multiplication of zeros of ρ_1 and ρ_2 has been counted twice, so
that actually the number of zeros in ρ is
b_0^{(0)} = b_0^{(1)} b_2 + b_0^{(2)} b_1 - b_0^{(1)} b_0^{(2)}, which is exactly
the coefficient of {0} in (3.3) when expanded according to the properties (3.4).

Similarly, the integer 1 occurs only in those places where one of the b_1^{(1)}
1's of ρ_1 multiplies one of the b_1^{(2)} 1's in ρ_2. Hence we must have
b_1^{(0)} = b_1^{(1)} · b_1^{(2)}, which is exactly the coefficient of {1} in the
expression (3.3) when expanded according to the properties (3.4).

In the same way, it is easy to verify that the integer u occurs in ρ b_u^{(0)}
times, where b_u^{(0)} is the coefficient of {u} in (3.3). This proves the theorem.
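
The counting argument can be replayed numerically. In the Python sketch below, offered only as an illustration, the intersection profile of a block is stored as a mapping from u to the number of blocks (the chosen block included) sharing u treatments with it, and two profiles are multiplied by the rule {i}{j} = {ij} of (3.4). The result is compared with the profile computed directly from the Kronecker product of the incidence matrices. The two small designs and the helper names `profile` and `multiply_profiles` are assumptions made for the example.

```python
from collections import Counter


def incidence(v, blocks):
    """v x b incidence matrix of a design on treatments 0, ..., v - 1."""
    return [[1 if t in blk else 0 for blk in blocks] for t in range(v)]


def kron(A, B):
    """Kronecker product of two matrices given as lists of rows."""
    return [[a * b for a in ra for b in rb] for ra in A for rb in B]


def profile(N, j):
    """{u: number of blocks having u treatments in common with block j},
    block j itself included, so the counts add up to b."""
    cols = list(zip(*N))
    return Counter(sum(x * y for x, y in zip(cols[j], c)) for c in cols)


def multiply_profiles(p1, p2):
    """Expand (3.3) with the rule {i}{j} = {ij} of (3.4)."""
    out = Counter()
    for i, bi in p1.items():
        for j, bj in p2.items():
            out[i * j] += bi * bj
    return out


N1 = incidence(7, [{i % 7, (i + 1) % 7, (i + 3) % 7} for i in range(7)])
N2 = incidence(3, [{0, 1}, {0, 2}, {1, 2}])

predicted = multiply_profiles(profile(N1, 0), profile(N2, 0))
observed = profile(kron(N1, N2), 0)     # column 0 is the product of the two chosen blocks
print(dict(predicted) == dict(observed))
print(sorted(predicted.items()))
```

The entry for u = k_1 k_2 with count 1 corresponds to the block B itself; removing it leaves the grouping of the remaining blocks.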

From the block structures of affine resolvable BIB designs [2] and symmetrical
BIB designs [1], we can easily deduce the following corollaries of Theorem 3.2.

COROLLARY 3.2.1. If a PBIB design with three associate classes and with
parameters (2.5) is the Kronecker product of the affine resolvable BIB design
with parameters


         v_1 = n k_1 = n^2{(n - 1)t + 1},
(3.7)
         b_1 = n r_1 = n(n^2 t + n + 1),   λ_1 = nt + 1,

and the symmetrical BIB design with parameters


(3.8)    v_2 = b_2,   r_2 = k_2,   λ_2,


then with respect to any block B in it, the other blocks fall into four groups
(α), (β), (γ), and (δ) such that the group (α) contains n_1' = b_2 - 1 = v_2 - 1
blocks each having k_1 λ_2 treatments in common with B, the group (β) contains
b_1 - n blocks each having m r_2 treatments in common with B, the group (γ)
contains n_1'(b_1 - n) blocks each having m λ_2 treatments in common with B, and
the group (δ) contains b_2(n - 1) blocks each having zero treatments in common
with B, where


m = (k_1)^2/v_1 = (n - 1)t + 1.


The groups (α), (β), (γ), (δ) are all distinct if n λ_2 ≠ k_2.

COROLLARY 3.2.2. If a PBIB design with three associate classes and with
parameters (2.5) is the Kronecker product of the two affine resolvable BIB
designs with parameters


         v_1 = n_1 k_1 = n_1^2{(n_1 - 1)t_1 + 1},
(3.9)
         b_1 = n_1 r_1 = n_1(n_1^2 t_1 + n_1 + 1),   λ_1 = n_1 t_1 + 1

and

         v_2 = n_2 k_2 = n_2^2{(n_2 - 1)t_2 + 1},
(3.10)
         b_2 = n_2 r_2 = n_2(n_2^2 t_2 + n_2 + 1),   λ_2 = n_2 t_2 + 1,


then with respect to any block B in it, the other blocks fall into four groups
(α), (β), (γ) and (δ), such that the group (α) contains b_2 - n_2 blocks each
having m_2 k_1 treatments in common with B, the group (β) contains b_1 - n_1
blocks each having m_1 k_2 treatments in common with B, the group (γ) contains
(b_1 - n_1)(b_2 - n_2) blocks each having m_1 m_2 treatments in common with B,
and the group (δ) contains


b_1(n_2 - 1) + b_2(n_1 - 1) - (n_1 - 1)(n_2 - 1)

blocks each having zero treatments in common with B, where

m_1 = (k_1)^2/v_1 = (n_1 - 1)t_1 + 1   and   m_2 = (k_2)^2/v_2 = (n_2 - 1)t_2 + 1.

The groups (α), (β), (γ), (δ) are all distinct if n_1 ≠ n_2.


4. Concluding remarks. 

(i) A similar analysis can be carried out for the PBIB designs with two asso- 
ciate classes which are Kronecker product of BIB designs (cf. 2B above). 

(ii) It is easy to see that a PBIB design which is the Kronecker product of 
a resolvable BIB design and another BIB design is also resolvable. 

(iii) It is interesting to note clearly the connection between corollaries to 
Theorem 3.2 on the one hand and Theorem 2.1 on the inversion of designs on 
the other. For example, from Corollary 3.2.1 one may gather the false impres- 
sion that the Kronecker product of an affine resolvable BIB design and a sym- 
metrical BIB design would lead on inversion to a PBIB design with four asso- 
ciate classes. Remembering, however, that an affine resolvable BIB design gives 
on inversion a PBIB design with two associate classes and that a symmetrical 
BIB design is self-dual, we find from Theorem 2.1 and Theorem 4.2 of [7], that 
the dual of the Kronecker product under consideration is, in fact, a PBIB de- 
sign with five associate classes all of which are distinct. This apparent contradic- 
tion is resolved if we observe that the number of distinct associate classes in a 
PBIB design depends not only on its λ parameters but also on the matrices
(p^i_{jk}) of its secondary parameters, the exact relation being given in Lemma 4.1
of [7], whereas for finding relations among the blocks of the inverted design we
are concerned only with the number of different λ parameters of the PBIB
design. Thus in the example under discussion, the PBIB design with five distinct
associate classes has two of its λ parameters equal to zero, and therefore there
are only four different λ parameters which determine the four types of relations
among the blocks of the inverted design.

Similar remarks apply to the PBIB designs obtained in 2B and their duals. 

Further work of this type, applicable to PBIB designs in general, is in
progress, and the author hopes to publish a separate paper dealing with it.


Acknowledgment. I wish to express my sincere thanks to Professor M. C. 
Chakrabarti for his kind interest in this work. I am also indebted to the referee 
for his helpful suggestions.


THE DUAL OF A BALANCED INCOMPLETE BLOCK DESIGN


1. Summary. Shrikhande [9] and Roy [7] have shown that certain Balanced
Incomplete Block Designs (BIBDs) can be dualised to give Partially Balanced 
Incomplete Block Designs (PBIBDs) with exactly two associate classes. Roy 
and Laha [8] have obtained a necessary and sufficient condition for the dual of 
a BIBD to be a PBIBD with two associate classes. In this paper, a general re- 
sult regarding the dual of a BIBD is established and the results of Shrikhande 
and Roy are obtained as particular cases. An illustration to show the use of the 
result when the dual is not a 2-associate PBIBD is also given. 


2. Two Lemmas connecting the parameters of a BIBD. For the definition of 
a BIBD the reader may refer to Kempthorne [4]. The following two lemmas will 
be stated without proof. Lemma 2.1 is due to Connor [2], while Lemma 2.2 is 
due to Hussain [3]. 

LEMMA 2.1: If l_ij is the number of treatments common to the ith and the
jth blocks of a BIBD with parameters v*, b*, r*, k*, λ*, the following
inequalities hold:

(2.1)    [2λ*k* + r*(r* - λ* - k*)]/r* ≥ l_ij ≥ -(r* - λ* - k*).

LEMMA 2.2: If n_u denotes the number of blocks having u - 1 treatments in
common with a chosen initial block of a BIBD with parameters v*, b*, r*, k*, λ*,
and t is the largest integer contained in [2λ*k* + r*(r* - λ* - k*)]/r*, such
that t < k* + 1, the following equalities hold:


(2.2)    Σ_{u=1}^{t+1} n_u = b* - 1,

(2.3)    Σ_{u=1}^{t+1} (u - 1) n_u = k*(r* - 1),

(2.4)    Σ_{u=1}^{t+1} (u - 1)(u - 2) n_u = k*(k* - 1)(λ* - 1).

Note that if (2.2), (2.3) and (2.4) admit a unique nonnegative integral
solution, then, corresponding to each block of the design, the remaining b* - 1
blocks may be divided into t + 1 = m groups such that a block in the uth group
has exactly u - 1 = λ_u (u = 1, 2, ..., m) treatments in common with the
chosen initial block, there being exactly n_u blocks in the uth group.
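
Whether (2.2), (2.3) and (2.4) admit a unique non-negative integral solution can be settled by direct enumeration when the parameters are small. The Python sketch below is a brute-force illustration, not a method proposed in the references; the function name `block_intersection_solutions` is hypothetical. It computes t as in Lemma 2.2 and lists every non-negative integer vector (n_1, ..., n_{t+1}) satisfying the three equalities.

```python
from fractions import Fraction


def block_intersection_solutions(v, b, r, k, lam):
    """Return (t, solutions) for a BIBD with parameters v*, b*, r*, k*, lambda*.

    Each solution is a tuple (n_1, ..., n_{t+1}) of non-negative integers
    satisfying (2.2), (2.3) and (2.4)."""
    bound = Fraction(2 * lam * k + r * (r - lam - k), r)
    t = min(bound.numerator // bound.denominator, k)   # integer part, with t < k + 1
    m = t + 1
    targets = (b - 1, k * (r - 1), k * (k - 1) * (lam - 1))
    found = []

    def extend(u, partial, sums):
        if u > m:
            if sums == targets:
                found.append(tuple(partial))
            return
        weights = (1, u - 1, (u - 1) * (u - 2))   # coefficients of n_u in (2.2)-(2.4)
        n = 0
        while all(s + n * w <= tgt for s, w, tgt in zip(sums, weights, targets)):
            new_sums = tuple(s + n * w for s, w in zip(sums, weights))
            extend(u + 1, partial + [n], new_sums)
            n += 1

    extend(1, [], (0, 0, 0))
    return t, found


# Parameters v* = 6, b* = 10, r* = 5, k* = 3, lambda* = 2.
print(block_intersection_solutions(6, 10, 5, 3, 2))
```

For these parameters the enumeration gives t = 2 and the single solution (0, 6, 3): no block is disjoint from the initial block, six blocks meet it in one treatment and three meet it in two.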


3. The definition of a PBIBD. An incomplete block design is said to be a
PBIBD if it satisfies the following conditions:

(3.1) There are v treatments divided into b blocks of k plots each, different
treatments being applied to the plots in the same block.

(3.2) Each treatment occurs in exactly r blocks.

(3.3) There can be established an association relationship between any two
treatments satisfying the following conditions:

(3.3a) Two treatments are either 1st, 2nd, ..., or mth associates.

(3.3b) Each treatment has exactly n_u uth associates (u = 1, ..., m).

(3.3c) Given any two treatments which are kth associates, the number of
treatments which are the uth associates of the first and u'th associates of the
second is P^k_{uu'}. Also, P^k_{uu'} = P^k_{u'u}.

(3.4) Two treatments which are uth associates will occur together in exactly
λ_u (u = 1, 2, ..., m) blocks.

For the necessary conditions satisfied by the parameters of a PBIBD the
reader is referred to Bose and Nair [1] and Nair and Rao [6].


4. The dual of a design. Let B_1, B_2, ..., B_{b*} and T_1, T_2, ..., T_{v*}
denote the blocks and treatments of a given design, D*, in which v* (= b)
treatments are arranged in b* (= v) blocks of k* (= r) plots each such that
every treatment is replicated r* (= k) times. Let D be a new design with v
treatments and b blocks constructed by placing the treatment numbered j in the
block numbered i of D, if in D* the block B_j contains the treatment T_i. The
designs D* and D are said to be the duals of each other. Evidently, in D each
block contains k plots and each treatment is replicated r times. Further, if
N* = (n_ij), (i = 1, 2, ..., v*; j = 1, 2, ..., b*), where n_ij denotes the
number of times the ith treatment occurs in the jth block, is the incidence
matrix of D*, the incidence matrix of D is (N*)', where (N*)' is the transpose
of N*. Also the element in the ith row and the jth column of the v × v matrix
(N*)'N* will be equal to the number of blocks in the dual design D in which the
ith and the jth treatments occur together.


5. The dual of a BIBD. Consider a BIBD with parameters v* (= b), b* (= v),
r* (= k), k* (= r), λ*. Let N* = (n_ij) be the incidence matrix. We have, by the
well known properties of a BIBD,


(5.1)    N*(N*)' = λ* E_{v*} + (r* - λ*) I_{v*},


where E_{v*} is a v* × v* matrix with all elements unity and I_{v*} is a v* × v*
identity matrix. Also,


(5.2)    (N*)'N* = ( Σ_{i=1}^{v*} n_ij n_ij' ) = (λ_jj'),


where, as already observed in the previous section, λ_jj' is the number of
treatments common to the jth and the j'th blocks of the original BIBD, which is
also equal to the number of blocks of the dual design in which the jth and the
j'th treatments occur together. Thus, in the dual design, a pair of treatments
can occur together in at most t blocks, where t is defined as in Lemma 2.2.
Further, if the equations (2.2), (2.3) and (2.4) admit a unique integral
non-negative solution, in the dual design, corresponding to each treatment, the
remaining v - 1 treatments can be divided into t + 1 = m groups, such that a
treatment in the uth group will occur in exactly λ_u = u - 1 (u = 1, 2, ..., m)
blocks with the initial treatment, and there will be exactly n_u treatments in
the uth class. At this point, it may be noted that we do not exclude the
possibility of some of the n_u's being zero, in which case the exact number of
classes will be less than m. In fact, the total number of groups will be exactly
equal to the total number of non-null n_u's.

We now proceed to investigate the conditions under which the dual will be a 
PBIBD. Evidently, if the equations (2.2), (2.3) and (2.4) admit a unique 
integral non-negative solution, then the conditions (3.1), (3.2), (3.3a), (3.3b) 
and (3.4) are satisfied by the dual design. Hence it remains to see when (3.3c) 
will also be satisfied. 

Define $m$ matrices $B_u$ $(u = 1, 2, \ldots, m)$, each of order $v \times v$, as


(5.3)    $B_u = (b^u_{jj'}), \quad j, j' = 1, 2, \ldots, b^*,$


where $b^u_{jj} = 0$ for all $j$, and $b^u_{jj'} = 1$ if $\lambda_{jj'} = \lambda_u$ and 0 otherwise, for all $j \neq j'$.
The matrices $B_u$ are symmetric, independent, and commutative with respect
to multiplication. It is also clear that


(5.4)    $\sum_s b^u_{is} b^{u'}_{sj} = \sum_s b^{u'}_{is} b^u_{sj} = C^{uu'}_{ij},$

which is the number of treatments common to the $u$th and $u'$th groups of treatments
with respect to the treatments numbered $i$ and $j$ in the dual design if
$i \neq j$. It equals $n_u$ if $i = j$ and $u = u'$, and it equals zero if $i = j$ and $u \neq u'$.

Now consider any block, $B_i$, of the original BIBD. There will be $n_u$ blocks
in the design that have exactly $\lambda_u$ treatments in common with $B_i$. Of these $n_u$
blocks, $C^{uu'}_{ij}$ blocks will have $\lambda_{u'}$ treatments in common with the block $B_j$. Hence


(5.5)    $\sum_{u'=1}^{m} C^{uu'}_{ij} = n_u$ if the blocks $B_i$ and $B_j$ do not have $\lambda_u$ treatments in common,

         $= n_u - 1$ otherwise.


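Now using (5.2) and (5.3), and observing that $\lambda_{jj} = k^*$, we have,


(5.6)    $(N^*)'N^* = k^* I_{b^*} + \sum_{u=1}^{m} \lambda_u B_u,$


and hence,

(5.7)    $[(N^*)'N^*][(N^*)'N^*] = (N^*)'[N^*(N^*)']N^*$

                 $= (N^*)'[\lambda^* E_{v^*} + (r^* - \lambda^*) I_{v^*}]N^*$

                 $= \lambda^* (N^*)' E_{v^*} N^* + (r^* - \lambda^*)(N^*)'N^*.$

As $N^*$ is the incidence matrix of a BIBD it is easy to verify that


(5.8)    $(N^*)' E_{v^*} N^* = (k^*)^2 E_{b^*},$

and that the left hand side of (5.7) can also be expressed as


(5.9)    $(N^*)'N^* \left[ k^* I_{b^*} + \sum_{u=1}^{m} \lambda_u B_u \right].$


Hence, using (5.8) and (5.9) and noting that $\lambda_1 = 0$, we get from (5.7),

         $k^*(N^*)'N^* + (N^*)'N^* \left( \sum_{u=2}^{m} \lambda_u B_u \right) = \lambda^*(k^*)^2 E_{b^*} + (r^* - \lambda^*)(N^*)'N^*.$

Hence, from (5.6),

         $\lambda^*(k^*)^2 E_{b^*} - k^*(k^* - r^* + \lambda^*) I_{b^*} = (2k^* - r^* + \lambda^*) \sum_{u=2}^{m} \lambda_u B_u + \left( \sum_{u=2}^{m} \lambda_u B_u \right)^2.$

Hence

(5.10)   $\lambda^*(k^*)^2 E_{b^*} + k^*(r^* - k^* - \lambda^*) I_{b^*} = (2k^* - r^* + \lambda^*) \sum_u \lambda_u B_u + \sum_u \sum_{u'} \lambda_u \lambda_{u'} B_u B_{u'}.$


Comparing the $(ij)$th non-diagonal terms on both sides of (5.10),

         $\sum_u \sum_{u'} \lambda_u \lambda_{u'} \sum_s b^u_{is} b^{u'}_{sj} = \lambda^*(k^*)^2 - (2k^* - r^* + \lambda^*) \sum_u \lambda_u b^u_{ij}.$

Using the notation of (5.4),

(5.11)   $\sum_u \sum_{u'} \lambda_u \lambda_{u'} C^{uu'}_{ij} = \lambda^*(k^*)^2 - (2k^* - r^* + \lambda^*) \sum_u \lambda_u b^u_{ij}.$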

We can divide the set of $(b^*)^2$ equations (5.11) into $m$ mutually exclusive
sets such that the $q$th set $(q = 1, 2, \ldots, m)$ contains all the equations with
$C^{uu'}_{ij}$ for $\lambda_{ij} = \lambda_q$. The coefficients on the left hand side, and the constant on the
right hand side, are the same for all the equations in a given set. In fact, the equations
in the $q$th set will be obtained by giving all the values to $i$ and $j$ such that
$\lambda_{ij} = \lambda_q$ in


(5.12)   $\sum_u \sum_{u'} \lambda_u \lambda_{u'} C^{uu'}_{ij} = \lambda^*(k^*)^2 - (2k^* - r^* + \lambda^*)\lambda_q.$

Thus it is clear that the values of $C^{uu'}_{ij}$ depend only on $\lambda_u$, $\lambda_{u'}$ and $\lambda_{ij}$. Hence, by
writing $C^{uu'}_{ij} = P^q_{uu'}$ if $\lambda_{ij} = \lambda_q$, the equations (5.5) and (5.12) may be rewritten
as


(5.13)   $\sum_{u'=1}^{m} P^q_{uu'} = n_u$ if $u \neq q$,

         $= n_u - 1$ if $u = q$;

and


(5.14)   $\sum_u \sum_{u'} \lambda_u \lambda_{u'} P^q_{uu'} = \lambda^*(k^*)^2 - (2k^* - r^* + \lambda^*)\lambda_q, \quad q = 1, 2, \ldots, m.$


Hence, if (5.14) has a unique integral non-negative solution, it follows from (5.4)
and (5.13) that the number of treatments common to the $u$th group and $u'$th
group of two treatments is the same for all treatment pairs which belong to the
$q$th group with respect to each other. This number is equal to $P^q_{uu'}$, with $P^q_{uu'} =
P^q_{u'u}$. Thus we have proved Theorem 5.1.

THEOREM 5.1: The dual of a BIBD with parameters $v^* (= b)$, $b^* (= v)$, $r^* (= k)$,
$k^* (= r)$, $\lambda^*$ is a PBIBD with parameters $v$, $b$, $r$, $k$; $\lambda_1, \lambda_2, \ldots, \lambda_m$; $n_1, n_2,
\ldots, n_m$; $P^q_{uu'}$ $(u, u', q = 1, 2, \ldots, m)$, where $m = t + 1$ is defined as in Lemma
2.2, provided the equations (2.2), (2.3), (2.4) and (5.14) admit a unique integral
non-negative solution subject to the conditions (5.13).


6. Shrikhande’s two theorems as particular cases of the Theorem 5.1. 

(6.1) The case $\lambda^* = 1$. Consider a BIBD with parameters $v^* (= b)$, $b^* (= v)$,
$r^* (= k)$, $k^* (= r)$, $\lambda^* = 1$. In this case we have $t = 1$ and the equations (2.2),
(2.3) and (2.4) reduce to $n_1 + n_2 = b^* - 1$ and $n_2 = k^*(r^* - 1)$, giving the
unique non-negative solution


         $n_1 = (v - 1) - r(k - 1),$
         $n_2 = r(k - 1).$


Noting that $\lambda_1 = 0$ and $\lambda_2 = 1$, we can solve the equations (5.14) uniquely to
get the solution $P^1_{22} = r^2$, $P^2_{22} = r^2 - 2r + k - 1 = (r - 1)^2 + (k - 2)$. The
other parameters can be easily obtained by using condition (5.13).

Thus we have proved Shrikhande's [9] Theorem 1 that the dual of a BIBD with
parameters $v^* = rk - k + 1$, $b^* = k(rk - k + 1)/r$, $r^* = k$, $k^* = r$, $\lambda^* = 1$
is a PBIBD with parameters $v = k(rk - k + 1)/r$, $b = rk - k + 1$, $r = r$,
$k = k$; $\lambda_1 = 0$, $\lambda_2 = 1$; $n_2 = r(k - 1)$, $n_1 = (k - r)(r - 1)(k - 1)/r$;


$P^1_{uu'} = \begin{pmatrix} n_1 - 1 - r(k - r - 1) & r(k - r - 1) \\ r(k - r - 1) & r^2 \end{pmatrix},$


$P^2_{uu'} = \begin{pmatrix} (r - 1)(k - r)(k - r - 1)/r & (r - 1)(k - r) \\ (r - 1)(k - r) & (k - 2) + (r - 1)^2 \end{pmatrix}.$
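The values $n_2 = r(k - 1)$, $P^1_{22} = r^2$ and $P^2_{22} = (r - 1)^2 + (k - 2)$ can be checked on a small case. The sketch below is illustrative only. It assumes that $D^*$ is the BIBD of all pairs of five treatments ($v^* = 5$, $b^* = 10$, $r^* = 4$, $k^* = 2$, $\lambda^* = 1$), so that the dual has $r = 2$, $k = 4$. The function `count` is a hypothetical helper that evaluates $C^{uu'}_{ij}$ of (5.4) directly.

```python
from itertools import combinations

# D*: all pairs of 5 treatments, a BIBD with v* = 5, b* = 10, r* = 4, k* = 2, lambda* = 1.
blocks_star = [frozenset(pair) for pair in combinations(range(5), 2)]
v = len(blocks_star)   # treatments of the dual, v = b* = 10
r, k = 2, 4            # dual parameters: r = k*, k = r*

# lam[i][j]: number of blocks of the dual containing dual treatments i and j,
# i.e. the number of treatments common to blocks i and j of D*.
lam = [[len(blocks_star[i] & blocks_star[j]) for j in range(v)] for i in range(v)]

def count(i, j, lam_u, lam_w):
    """Treatments s that meet i in lam_u blocks and j in lam_w blocks."""
    return sum(1 for s in range(v)
               if s not in (i, j) and lam[i][s] == lam_u and lam[s][j] == lam_w)

pairs = [(i, j) for i in range(v) for j in range(v) if i != j]
n2 = {sum(1 for j in range(v) if j != i and lam[i][j] == 1) for i in range(v)}
p1_22 = {count(i, j, 1, 1) for i, j in pairs if lam[i][j] == 0}
p2_22 = {count(i, j, 1, 1) for i, j in pairs if lam[i][j] == 1}
```

Each of the three sets contains a single value, which is what the constancy of $P^q_{uu'}$ over pairs in the same group requires.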


(6.2) The case $\lambda^* = 2$. It can be easily seen that, if we exclude the solutions in
which the same block is repeated, for all designs with $\lambda^* = 2$ and $r \leq 10$, we
must have $t = 2$. In this case the equations (2.2), (2.3) and (2.4) will have the
unique solution given by


         $n_1 = (b^* - 1) - k^*(r^* - k^*) - k^*(k^* - 1)/2,$
         $n_2 = k^*(r^* - k^*),$
         $n_3 = k^*(k^* - 1)/2.$


But, in general, equations (5.14) will not have a unique solution. However, if
we consider the particular case $n_1 = 0$, i.e. when $r^* = k^* + 2$, the equations
(5.14), when $q = 3$, reduce to $P^3_{22} + 4(P^3_{23} + P^3_{33}) = 2k^*(k^* - 1)$. Hence,
using (5.13), we get $P^3_{22} = 2k^*(k^* - 1) - 4(n_3 - 1) = 4$. Similarly, the other
parameters may be found. Hence we have proved Theorem 3 of Shrikhande [6]
that the dual of a BIBD with parameters


$v^* = \binom{k-1}{2}$, $b^* = \binom{k}{2}$, $r^* = k$, $k^* = k - 2$, $\lambda^* = 2$


is a PBIBD with parameters


$v = \binom{k}{2}$, $b = \binom{k-1}{2}$, $r = k - 2$, $k = k$; $\lambda_1 = 1$, $\lambda_2 = 2$; $n_1 = 2(k - 2)$, $n_2 = \binom{k-2}{2}$;


$P^1_{uu'} = \begin{pmatrix} k - 2 & k - 3 \\ k - 3 & \binom{k-3}{2} \end{pmatrix}, \quad P^2_{uu'} = \begin{pmatrix} 4 & 2(k - 4) \\ 2(k - 4) & \binom{k-4}{2} \end{pmatrix}.$


Roy's [7] Theorem 3, regarding the dual of an affine resolvable BIBD, can be
proved in a similar way by using Theorem 5.1 of this paper.


7. Application of Theorem 5.1 when the solution of the equations is not unique. 
When the solution of the equations (2.2), (2.3), (2.4) and (5.14) is not unique, 
Theorem 5.1 will not give complete information about the dual. However, if 
the structure of the original BIBD is known, Theorem 5.1 can be used to simplify 
the investigation about the properties of the dual. As an illustration, we consider
the dual of a BIBD with parameters $v^* = 16$, $b^* = 24$, $r^* = 9$, $k^* = 6$,
$\lambda^* = 3$. A plan of this design is given by Mann [5]. He constructed it by the
process of residuation from the symmetric BIBD with parameters $v^* = b^* = 25$,
$r^* = k^* = 9$, $\lambda^* = 3$. We shall denote Mann's design by $D^*$.

Since any two blocks of a symmetric BIBD must have $\lambda^*$ treatments in common,
any two blocks of the design $D^*$ cannot have more than three treatments
in common. Hence we must have $n_5 = n_6 = n_7 = 0$. Thus the equations (2.2),
(2.3), and (2.4) can be written as


         $n_1 = 5 - n_4,$
         $n_2 = 3(n_4 - 4),$
         $n_3 = 3(10 - n_4).$


From inspection of Mann's plan we can see that no two blocks of the design $D^*$
have exactly one treatment in common. This gives the unique solution, $n_1 = 1$,
$n_2 = 0$, $n_3 = 18$, $n_4 = 4$. For the sake of simplicity, we shall write $n_1$, $n_2$, $n_3$
instead of $n_1$, $n_3$, $n_4$ and make corresponding changes in $P^q_{uu'}$. Now, as $n_1 = 1$
and $P^1_{11} + P^1_{12} + P^1_{13} = n_1 - 1 = 0$, we must have $P^1_{11} = P^1_{12} = P^1_{13} = 0$. Therefore,
the equations (5.14), when $q = 1$, may be solved uniquely to get the values
of $P^1_{uu'}$ $(u, u' = 2, 3)$. Again, if the dual of the design $D^*$ is a PBIBD, then
$P^2_{12}$ and $P^3_{13}$ must both be unique and equal to $(n_1/n_2)P^1_{22}$ and $(n_1/n_3)P^1_{33}$ respectively.
It can be verified that, for the design $D^*$, the values of $P^2_{12}$ and $P^3_{13}$ satisfy
these conditions and are both equal to 1. Hence, as $n_1 = 1$, it follows from (5.13)
that $P^2_{11} = P^2_{13} = P^3_{11} = P^3_{12} = 0$. It is now easy to see that the equations (5.14)
will have the unique solution


$P^1_{uu'} = \begin{pmatrix} 0 & 0 & 0 \\ 0 & 18 & 0 \\ 0 & 0 & 4 \end{pmatrix}, \quad P^2_{uu'} = \begin{pmatrix} 0 & 1 & 0 \\ 1 & 12 & 4 \\ 0 & 4 & 0 \end{pmatrix}, \quad P^3_{uu'} = \begin{pmatrix} 0 & 0 & 1 \\ 0 & 18 & 0 \\ 1 & 0 & 2 \end{pmatrix}.$


Hence the dual of the design $D^*$ is a PBIBD with the parameters $v = 24$, $b = 16$,
$r = 6$, $k = 9$; $\lambda_1 = 0$, $\lambda_2 = 2$, $\lambda_3 = 3$; $n_1 = 1$, $n_2 = 18$, $n_3 = 4$; $P^q_{uu'}$ $(u, u', q
= 1, 2, 3)$.

Roy and Laha [8] have already pointed out that this PBIBD may be ob- 
tained as the dual of a BIBD. However, they have not stated how they arrived 
at this conclusion. 
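The way (5.13) and (5.14) pin down the $P^q_{uu'}$ for this example can be sketched by brute force. The code below is an illustration and not the authors' procedure. It takes the relabelled values $\lambda_1 = 0$, $\lambda_2 = 2$, $\lambda_3 = 3$ and $n_1 = 1$, $n_2 = 18$, $n_3 = 4$ from the text as given, assumes the symmetry $P^q_{uu'} = P^q_{u'u}$, and enumerates non-negative integer matrices. The name `solve_q` is hypothetical, and `q` is zero-based in the code.

```python
from itertools import product

# Parameters of Mann's design D* as quoted in the text.
r_star, k_star, lam_star = 9, 6, 3
lams = [0, 2, 3]     # lambda_1, lambda_2, lambda_3 after relabelling
n = [1, 18, 4]       # n_1, n_2, n_3 after relabelling

def solve_q(q):
    """All symmetric non-negative integer 3 x 3 matrices satisfying (5.13) and (5.14)."""
    rhs = lam_star * k_star ** 2 - (2 * k_star - r_star + lam_star) * lams[q]
    row_sums = [n[u] - (1 if u == q else 0) for u in range(3)]   # condition (5.13)
    solutions = []
    for p11, p12, p13 in product(range(row_sums[0] + 1), repeat=3):
        if p11 + p12 + p13 != row_sums[0]:
            continue
        for p22 in range(row_sums[1] + 1):
            p23 = row_sums[1] - p12 - p22
            p33 = row_sums[2] - p13 - p23
            if p23 < 0 or p33 < 0:
                continue
            p = [[p11, p12, p13], [p12, p22, p23], [p13, p23, p33]]
            total = sum(lams[u] * lams[w] * p[u][w]
                        for u in range(3) for w in range(3))       # left side of (5.14)
            if total == rhs:
                solutions.append(p)
    return solutions
```

Under these assumptions `solve_q(0)`, `solve_q(1)` and `solve_q(2)` each return exactly one matrix, and the row sums of each matrix reproduce (5.13).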


NOTES


ON THE UNBIASEDNESS OF YATES' METHOD OF ESTIMATION
USING INTERBLOCK INFORMATION


By FRANKLIN A. GRAYBILL AND V. SESHADRI


Oklahoma State University 


In a balanced incomplete block model with blocks and errors random normal 
variables, Yates has shown that there are two independent unbiased estimates 
for any treatment contrast. These are referred to as intrablock and interblock 
estimators. Yates has also given a method for combining these two estimators 
which depends on the variances (unknown) and has shown how to estimate the 
variances from an analysis of variance [1]. Since this combined estimator is used 
quite extensively, it seems desirable to study its properties. Graybill and Weeks 
[2] have shown that Yates’ combined estimator is based on a set of minimal 
sufficient statistics and have presented an estimator which is unbiased. 

The purpose of this note is to show that Yates’ estimator, which is based on intra- 
block and interblock information, is unbiased. 

The model and distributional assumptions in this paper are exactly those 
given in [2], and the same notations are used and will not be repeated here. 

In [2] it is shown that Yates' estimator (denoted by $\hat\tau_i$) of $\tau_i$ is


(1)      $\hat\tau_i = x_i + \gamma(u_i - x_i)$             if $\hat\sigma_\beta^2 > 0$
                 $= x_i + (\lambda t/rk)(u_i - x_i)$     if $\hat\sigma_\beta^2 \leq 0$


BMY Sh) po wens. Ak oge , ACK — t) 
mea oda tne + Fe —1) 
WUE es ite Gere ay YE eae eo 2 
ke ay x)(U - X)+—— 5 - Cog 7 Ss 


s 


and where
(3)      $\hat\sigma_\beta^2 = \frac{1}{t(r - 1)} \left[ \frac{\lambda t(r - \lambda)}{rk}(U - X)'(U - X) + S^{*2} - \frac{b - 1}{f} S^2 \right].$
We now define $\phi(\hat\sigma_\beta^2)$ such that
         $\phi(\hat\sigma_\beta^2) = 0$ if $\hat\sigma_\beta^2 > 0$
                  $= 1$ if $\hat\sigma_\beta^2 \leq 0.$
Yates' estimate can now be written as
(4)      $\hat\tau_i = [1 - \phi(\hat\sigma_\beta^2)][x_i + \gamma(u_i - x_i)] + \phi(\hat\sigma_\beta^2)[x_i + (\lambda t/rk)(u_i - x_i)].$
Clearly (4) is equivalent to (1). Rearranging and simplifying (4) we get
         $\hat\tau_i = [x_i + \gamma(u_i - x_i)] + \phi(\hat\sigma_\beta^2)[(\lambda t/rk) - \gamma](u_i - x_i).$
Graybill and Weeks have shown in [2] that $E[x_i + \gamma(u_i - x_i)] = \tau_i$. Therefore
in order to show that Yates' estimate is unbiased we need only show that
         $E\{\phi(\hat\sigma_\beta^2)[(\lambda t/rk) - \gamma](u_i - x_i)\} = 0.$

Let $z_i = (u_i - x_i)$ where $i = 1, 2, \ldots, t - 1$. Now $\hat\sigma_\beta^2$ is a function of $z_i$, $S^2$,
and $S^{*2}$. So let

         $\hat\sigma_\beta^2 = g(z_1, z_2, \ldots, z_{t-1}, S^2, S^{*2}).$
$\gamma$ is also a function of $z_i$, $S^2$, and $S^{*2}$. Therefore, let

         $\gamma = h(z_1, z_2, \ldots, z_{t-1}, S^2, S^{*2}).$
Denote the joint density of the $t + 1$ random variables $z_1, z_2, \ldots, z_{t-1}, S^2, S^{*2}$
by $f(z_1, z_2, \ldots, z_{t-1}, S^2, S^{*2})$. From (2) it is clear that $\gamma$ is an even function
of the $z_i$ and from (3) we see that $\hat\sigma_\beta^2$ is also an even function of the $z_i$. Therefore,
$\phi(\hat\sigma_\beta^2)$ is an even function of $z_i$ $(i = 1, 2, \ldots, t - 1)$ and $\phi(\hat\sigma_\beta^2)[(\lambda t/rk) - \gamma]$
is also an even function of $z_i$. Hence $\phi(\hat\sigma_\beta^2)[(\lambda t/rk) - \gamma](u_i - x_i)$ is an odd
function of $z_i$. Therefore,


         $E\{\phi(\hat\sigma_\beta^2)[(\lambda t/rk) - \gamma](u_i - x_i)\} = 0,$


since the $z_i$ are independent normal variables with mean zero and are independent
of $S^2$ and $S^{*2}$. Thus Yates' estimator, which is based on intrablock and interblock
information, is unbiased.
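The symmetry step of the argument can be illustrated numerically. The sketch below is only an illustration: the functions standing in for $g$ and $h$ are hypothetical even functions, not the expressions in (2) and (3), and the constant 0.5 stands in for $\lambda t/rk$. Because the $z_i$ are symmetric about zero, each draw $z$ is paired with $-z$. The two contributions of the odd term then cancel exactly, which is the finite-sample counterpart of the expectation being zero.

```python
import random

def odd_term(z, s2, s2_star, i=0):
    """Illustrative stand-in for phi * [(lambda t / rk) - gamma] * z_i."""
    q = sum(val * val for val in z)             # even in every z_j
    sigma_hat = q + s2_star - s2                # hypothetical even function g
    phi = 1.0 if sigma_hat <= 0 else 0.0        # indicator phi as defined in the text
    gamma = 1.0 / (1.0 + q + s2)                # hypothetical even function h
    return phi * (0.5 - gamma) * z[i]

def antithetic_mean(n_draws, t=5, seed=0):
    """Average of the odd term over draws z paired with their reflections -z."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_draws):
        z = [rng.gauss(0.0, 1.0) for _ in range(t - 1)]
        s2, s2_star = rng.random() * 10, rng.random()
        reflected = [-val for val in z]
        total += odd_term(z, s2, s2_star) + odd_term(reflected, s2, s2_star)
    return total / (2 * n_draws)
```

Any other even choices for the two stand-in functions would give the same cancellation, since only the evenness of $\phi$ and $\gamma$ in the $z_i$ is used.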
ON THE BLOCK STRUCTURE OF CERTAIN PBIB DESIGNS WITH TWO
ASSOCIATE CLASSES HAVING TRIANGULAR AND $L_2$
ASSOCIATION SCHEMES


By DAMARAJU RAGHAVARAO
University of Bombay


0. Summary. The PBIB designs [2] with two associate classes are classified in
[3] as 1. Group Divisible, 2. Simple, 3. Triangular, 4. Latin Square type with $i$
constraints, and 5. Cyclic. Group Divisible designs are divided into three types
[1]: 1. Singular, 2. Semi-regular, and 3. Regular. It has been proved [1] that
every block of a Semi-regular Group Divisible design contains $k/m$ treatments
from each of the $m$ groups of the association scheme. In this note we prove
analogous results in the case of certain PBIB designs with triangular and $L_2$
association schemes.


1. On the Block Structure of certain PBIB designs with two associate classes 
having a triangular association scheme. A PBIB design with two associate 
classes is said to have a triangular association scheme [3] if the number of treat- 
ments v = n(n — 1)/2 and the association scheme is an array of n rows and n 
columns with the following properties: 

(a) The positions in the principal diagonal are blank. 

(b) The n(n — 1)/2 positions above the principal diagonal are filled by the 
numbers 1, 2, --- , n(n — 1)/2 corresponding to the treatments. 

(c) The array is symmetric about the principal diagonal. 

(d) For any treatment $\theta$, the first associates are exactly those treatments
which lie in the same row and same column as $\theta$.

It is then obvious that

(1) the number of first associates of any treatment is $n_1 = 2n - 4$, and

(2) with respect to any two treatments $\theta_1$ and $\theta_2$ which are first associates,
the number of treatments which are first associates of both $\theta_1$ and $\theta_2$ is
$p^1_{11}(\theta_1, \theta_2) = n - 2$.

We now prove 

THEOREM 1.1. If in a PBIB design with two associate classes having a triangular
association scheme


(1.1)    $rk - v\lambda_1 = n(r - \lambda_1)/2,$


then $2k$ is divisible by $n$. Further, every block of the design contains $2k/n$ treatments
from each of the $n$ rows of the association scheme.

PROOF. Let $e^i_j$ treatments occur in the $j$th block from the $i$th row of the association
scheme $(i = 1, 2, \ldots, n; j = 1, 2, \ldots, b)$. Then we have


(1.2)    $\sum_{j=1}^{b} e^i_j = (n - 1)r,$

         $\sum_{j=1}^{b} e^i_j (e^i_j - 1) = (n - 1)(n - 2)\lambda_1,$


since each of the treatments occurs in $r$ blocks and every pair of treatments
from the same row of the association scheme occurs together in $\lambda_1$ blocks. From
(1.2), we get


(1.3)    $\sum_{j=1}^{b} (e^i_j)^2 = (n - 1)\{r + (n - 2)\lambda_1\}.$

Define $e^i_\cdot = b^{-1} \sum_{j=1}^{b} e^i_j = (n - 1)r/b = 2k/n$. Then

(1.4)    $\sum_{j=1}^{b} (e^i_j - e^i_\cdot)^2 = (n - 1)\{r + (n - 2)\lambda_1\} - 4bk^2/n^2$

                 $= 2(n - 1)[\{n(r - \lambda_1)/2\} - (rk - v\lambda_1)]/n$

                 $= 0,$
from (1.1). Therefore $e^i_1 = e^i_2 = \cdots = e^i_b = e^i_\cdot = 2k/n$. Since $e^i_j$ $(i = 1, 2, \ldots, n;
j = 1, 2, \ldots, b)$ must be integral, $2k$ is divisible by $n$. This completes the proof
of the theorem.

It has been proved ([4], [5], [7]) that a PBIB design with two associate classes
satisfying the relations (1) and (2) has a triangular association scheme for all
$n$ except 8. Using this result and Theorem 1.1, we have


COROLLARY 1.1.1. A necessary condition for the existence of a PBIB design with
two associate classes having the parameters


(1.5)    $v = n(n - 1)/2$, $b$, $r$, $k$, $\lambda_1$, $\lambda_2$, $n_1 = 2n - 4$, $p^1_{11} = n - 2$,


where $rk - v\lambda_1 = n(r - \lambda_1)/2$ and $n \neq 8$, is that $2k$ is divisible by $n$.
Now let us consider the PBIB design with parameters


         $v = n(n - 1)/2$, $b = (n - 1)(n - 2)/2$, $r = n - 2$,
(1.6)    $k = n$, $n_1 = 2n - 4$, $n_2 = (n - 2)(n - 3)/2$,
         $\lambda_1 = 1$, $\lambda_2 = 2$, $p^1_{11} = n - 2$, $p^2_{11} = 4$.


This PBIB design has been shown to have a triangular association scheme [8]. 
Further, the parameters satisfy relation (1.1). Hence every block of this design 
contains 2k/n = 2 treatments from each of the n rows of the association scheme. 
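For $n = 4$ the parameters (1.6) become $v = 6$, $b = 3$, $r = 2$, $k = 4$, $\lambda_1 = 1$, $\lambda_2 = 2$, and the statement about rows can be checked directly. The construction below is an assumption made for illustration (the note does not give a plan). Treatments are the pairs of four symbols, row $i$ of the triangular array holds the pairs containing symbol $i$, and each block is the union of two of the three perfect matchings.

```python
from itertools import combinations

n = 4
treatments = list(combinations(range(n), 2))          # v = n(n-1)/2 = 6 pairs
matchings = [
    [(0, 1), (2, 3)],
    [(0, 2), (1, 3)],
    [(0, 3), (1, 2)],
]
# Each block is the union of two of the three perfect matchings.
blocks = [set(a) | set(c) for a, c in combinations(matchings, 2)]

v, b, k = len(treatments), len(blocks), len(blocks[0])
r = sum(1 for blk in blocks if treatments[0] in blk)

def concurrences(t1, t2):
    """Number of blocks containing both treatments."""
    return sum(1 for blk in blocks if t1 in blk and t2 in blk)

# First associates share a symbol, i.e. lie in a common row of the triangular array.
lam1 = {concurrences(s, t) for s, t in combinations(treatments, 2) if set(s) & set(t)}
lam2 = {concurrences(s, t) for s, t in combinations(treatments, 2) if not set(s) & set(t)}

# e[i][j]: number of treatments of block j lying in row i (pairs containing symbol i).
e = [[sum(1 for t in blk if i in t) for blk in blocks] for i in range(n)]
```

In this small design relation (1.1) holds, and every entry of `e` equals $2k/n = 2$, in line with Theorem 1.1.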


2. On the Block Structure of certain PBIB Designs with two associate classes
having an $L_2$ association scheme. A PBIB design is said to have an $L_2$ association
scheme [3], if the number of treatments $v = s^2$, where $s$ is a positive integer,
and the treatments can be arranged in an $s \times s$ square such that treatments
in the same row or the same column are first associates, while others are second
associates. The following results are easily seen to hold in this case:

(i) The number of first associates of any treatment is $n_1 = 2s - 2$.

(ii) With respect to any two treatments $\theta_1$ and $\theta_2$ which are first associates,
the number of treatments which are first associates of both $\theta_1$ and $\theta_2$ is $p^1_{11} = s - 2$.

We now prove

THEOREM 2.1. If, in a PBIB design with two associate classes having an $L_2$ association
scheme,


(2.1)    $rk - v\lambda_1 = s(r - \lambda_1),$
then $k$ is divisible by $s$. Further, every block of the design contains $k/s$ treatments from
each of the $s$ rows (or columns) of the association scheme.

PROOF. Let $f^q_p$ treatments occur in the $p$th block from the $q$th row (or column)
of the association scheme $(p = 1, 2, \ldots, b; q = 1, 2, \ldots, s)$. We then have


(2.2)    $\sum_{p=1}^{b} f^q_p = sr,$

         $\sum_{p=1}^{b} f^q_p (f^q_p - 1) = s(s - 1)\lambda_1,$


since each of the treatments occurs in $r$ blocks and every pair of the treatments
from the same row (or column) of the association scheme occurs together in $\lambda_1$
blocks.


From (2.2), we get

(2.3)    $\sum_{p=1}^{b} (f^q_p)^2 = s\{r + (s - 1)\lambda_1\}.$


Define $f^q_\cdot = b^{-1} \sum_{p=1}^{b} f^q_p = sr/b = k/s$. Then


(2.4)    $\sum_{p=1}^{b} (f^q_p - f^q_\cdot)^2 = s\{r + (s - 1)\lambda_1\} - bk^2/s^2$

                 $= s(r - \lambda_1) - (rk - v\lambda_1) = 0,$


from (2.1). Therefore $f^q_1 = f^q_2 = \cdots = f^q_b = f^q_\cdot = k/s$. Since $f^q_p$ $(p = 1, 2, \ldots, b;
q = 1, 2, \ldots, s)$ must be integral, $k$ is divisible by $s$. Thus the theorem is proved.
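A small design satisfying (2.1) makes the conclusion concrete. The example below is an assumption chosen for illustration and is not taken from the note. With $s = 3$, the six blocks are the transversals of the $3 \times 3$ square, one per permutation of the columns, giving $v = 9$, $b = 6$, $r = 2$, $k = 3$, $\lambda_1 = 0$, $\lambda_2 = 1$.

```python
from itertools import combinations, permutations

s = 3
treatments = [(row, col) for row in range(s) for col in range(s)]   # v = s^2 cells
# One block per permutation: the cells (row, perm[row]) form a transversal.
blocks = [{(row, perm[row]) for row in range(s)} for perm in permutations(range(s))]

v, b, k = len(treatments), len(blocks), s
r = sum(1 for blk in blocks if treatments[0] in blk)

def concurrences(t1, t2):
    """Number of blocks containing both treatments."""
    return sum(1 for blk in blocks if t1 in blk and t2 in blk)

def first_associates(t1, t2):
    """Same row or same column of the s x s square."""
    return t1[0] == t2[0] or t1[1] == t2[1]

lam1 = {concurrences(a, c) for a, c in combinations(treatments, 2) if first_associates(a, c)}
lam2 = {concurrences(a, c) for a, c in combinations(treatments, 2) if not first_associates(a, c)}

# f_rows[q][p]: treatments of block p in row q; f_cols likewise for columns.
f_rows = [[sum(1 for t in blk if t[0] == q) for blk in blocks] for q in range(s)]
f_cols = [[sum(1 for t in blk if t[1] == q) for blk in blocks] for q in range(s)]
```

Here $rk - v\lambda_1 = 6 = s(r - \lambda_1)$, and every block takes exactly $k/s = 1$ treatment from each row and each column.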

It has been proved that a PBIB design with two associate classes satisfying
the relations (i) and (ii) has an $L_2$ association scheme if $s \neq 4$ ([6], [9]). Using this
result and Theorem 2.1 we have


COROLLARY 2.1.1. A necessary condition for the existence of a PBIB design with
two associate classes having the parameters


(2.5)    $v = s^2$, $b$, $r$, $k$, $\lambda_1$, $\lambda_2$, $n_1 = 2s - 2$, $p^1_{11} = s - 2$,
where $rk - v\lambda_1 = s(r - \lambda_1)$ and $s \neq 4$, is that $k$ is divisible by $s$.

OPTIMALITY CRITERIA FOR INCOMPLETE BLOCK DESIGNS


By K. R. SHAH
Forest Research Institute, Dehra Dun, India


1. Introduction and Summary. Several optimality criteria have been suggested 
for the efficiency of incomplete block designs. This note surveys these criteria, 
extends certain results and puts forward a new and simpler criterion. 


2. Existing Criteria. Important aims in experimental design are to estimate 
the effects of treatment comparisons with maximum precision for a given total 
number of experimental units, or total cost, and to perform a test of the null 
hypothesis. These two considerations lead us to different criteria for choosing 
from among the designs. 

Consider the class of incomplete block designs, $D_{vkb}$, for fixed values of $v$, $k$
and $b$ $(v > k)$, where $v$ treatments are arranged in $b$ blocks of $k$ plots each, and
each treatment is replicated $r$ times. In the usual notation (see, for example,
Kempthorne [2]), intra-block estimates of treatment effects are given by


(2.1)    $C\hat t = Q,$


where $C = rI - NN'/k$, $N$ being the incidence matrix of the design. We consider
only connected designs, so that the rank of $C$ is $v - 1$. Let $\lambda_1, \lambda_2, \ldots, \lambda_{v-1}$ be
the $v - 1$ non-zero latent roots of $C$. It is proved in [2] that the average variance
of all elementary treatment contrasts is proportional to $\sum \lambda_i^{-1}$. Let $P_i't$ $(i = 1, 2,
\ldots, v - 1)$ be any complete set of $v - 1$ orthogonal normalised contrasts. Set


         $P = (P_1, P_2, \ldots, P_{v-1})$, $P't = \rho$, $\rho' = \{\rho_1, \ldots, \rho_{v-1}\}$.


It can be shown that $P'CP$ is a non-singular matrix with latent roots $\lambda_1, \ldots, \lambda_{v-1}$,
and that (2.1) leads to


(2.2)    $P'CP\hat\rho = P'Q$ or $\hat\rho = (P'CP)^{-1}P'Q.$


Let us denote the dispersion matrix of $x$ by $V(x)$. Now $V(Q) = C\sigma^2$, which
gives $V(\hat\rho) = (P'CP)^{-1}\sigma^2$. Hence the generalised variance of $\hat\rho$ is given by


(2.3)    $|V(\hat\rho)| = |(P'CP)^{-1}| \sigma^{2(v-1)} = \sigma^{2(v-1)} \prod \lambda_i^{-1}.$


The usual null hypothesis $H_0$ is $t_1 = t_2 = \cdots = t_v$, which is equivalent to
$\rho_1 = \rho_2 = \cdots = \rho_{v-1} = 0$. The sum of squares for testing $H_0$ is $\hat t'Q$, which can be
shown to be equal to $\hat\rho'P'CP\hat\rho$. Hence the power of the F test is a monotonically
increasing function of $\delta = \rho'P'CP\rho/\sigma^2$.

The efficiency criteria considered so far by various authors are as follows: 

(A) If we wish to minimise the average variance of all elementary treatment
contrasts, we should minimise $\sum \lambda_i^{-1}$, [2], [4].

(B) Wald [6] argues that it is not possible to maximise power for all values of
$\rho$. Hence we should maximise $\delta$ for fixed values of $\rho'\rho/\sigma^2$. It is reasonable to maximise
the minimum of $\delta$ subject to $\rho'\rho/\sigma^2$ = constant. This leads to maximising
$\lambda_{\min}$, [1], [6].

(C) Wald [6] further argues that from certain mathematical considerations it
would be simpler to minimise $\prod \lambda_i^{-1}$. This minimises the generalised variance.
Also, as Nandi [5] has pointed out, this has the desirable effect of minimising the
volume of the equi-power ellipsoid given by $\rho'P'CP\rho/\sigma^2$ = const. In a sense this
minimises the range of $\rho$ subject to constant power. It should also be noted that
the design which minimises $\prod \lambda_i^{-1}$ gives certain optimum properties for the
usual F test associated with it, [3].

It is easy to see that the optima for all the criteria are reached when the $\lambda$'s
are all equal. Hence, when a balanced incomplete block design (BIB) exists in
the class $D_{vkb}$, it is the most efficient design in that class [4].

In [2] and [4] only the equi-replicate designs are considered. But the results
follow from the roots of $C$, and the only condition used in [4] is $\sum \lambda_i$ = constant.
Hence the results in [2] and Section 2 of [4] are valid also for the case of unequal 
number of replications. The extension of these results is not of mere academic 
interest; there are important classes of designs, such as inter and intra-group 
block designs and reinforced incomplete block designs, where the number of 
replications are usually unequal. 

Since efficiency should relate to the manner of utilization of the resources, in
framing an efficiency criterion, it seems natural to take into account the amount
of experimental material used. This would enable us to compare designs with
different sizes. Hence, we consider the class of designs, $D_{vk}$, for fixed values of $v$
and $k$ $(v > k)$, where $v$ treatments are arranged in blocks of $k$ plots each. Denote
by $r_i$ and $R$ the number of replications for the $i$th treatment and the average
number of replications respectively. Since $\sum \lambda_i$ = Trace $C$ = $(k - 1)\sum r_i/k$ =
$(k - 1)vR/k$, it is linearly related to the total number of plots.

The efficiency criteria, analogous to those in (A), (B) and (C) would be


(2.4)    $E_1 = (v - 1)/(R\sum \lambda_i^{-1})$, $E_2 = \lambda_{\min}/R$, $E_3 = 1/\{R(\prod \lambda_i^{-1})^{1/(v-1)}\}$.


Now for fixed $R$, the theoretical maxima of $E_1$, $E_2$, $E_3$ are attained when
$\lambda_1 = \lambda_2 = \cdots = \lambda_{v-1}$. Since this maximising solution is independent of $R$, it is
also the unconditional maximising solution. Now in the class of designs $D_{vk}$, a
BIB design always exists. Hence, judged by any of the three criteria, within the
class of designs $D_{vk}$ any of the BIB designs is the most efficient. It can be easily
seen that for the BIB design $E_1 = E_2 = E_3 = (1 - 1/k)/(1 - 1/v)$. And as is
to be expected, each one of them increases with $k$. In the limit when $k = v$, i.e.,
for randomised complete block designs, $E_1 = E_2 = E_3 = 1$.


3. A New Criterion. The above three criteria are based on different considera- 
tions and need not necessarily agree in comparing two given designs. Which 
criterion should be adopted depends upon our aim in conducting the experi- 
ment. But most often we shall be interested in both the interval estimation of 
treatment effects and in the test of the null hypothesis. 

It should be noted that, in the limit when optimality is reached, all the three
criteria lead to the same result, viz., the $\lambda$'s should be all equal. In fact for the
first and the third criteria, we are concerned with the geometric and the harmonic
means subject to the arithmetic mean being constant. When the experiment
is symmetrical, i.e., the $\lambda$'s are all equal, the three means coincide. This
suggests the use of $\sum (\lambda_i - \bar\lambda)^2/(v - 1)$ with $\sum \lambda_i$ = const., as a criterion for
optimality, i.e. among designs of given size, we should make $\sum \lambda_i^2$ as small as
possible, subject to existence of a design. To eliminate the effect of the size of
the design we define


(3.1)    $E_4 = \bar\lambda^2/\{R(\sum \lambda_i^2/(v - 1))^{1/2}\} = (v - 1)^{-3/2}(\sum \lambda_i)^2/\{R(\sum \lambda_i^2)^{1/2}\}.$

When the design is balanced, $E_4 = (1 - 1/k)/(1 - 1/v)$, and hence the efficiency
of a BIB increases with $k$ increasing, reaching unity when $k = v$. Nevertheless,
the criterion is suggested only for comparisons of different designs within
the class $D_{vk}$, with $v$ and $k$ fixed.

Though this criterion does not agree exactly with any of the three criteria
given above, it will tend to be as good as any of them. In any case, we are not
able to satisfy all the three criteria simultaneously. Smaller values of $\sum \lambda_i^2$ will
tend to give smaller values of $\sum \lambda_i^{-1}$ and $\prod \lambda_i^{-1}$, though this does not hold
exactly in all cases. Though the contours of equal efficiency (in the space of the
$\lambda$'s) are not identical with those for the other three criteria (which themselves
are not identical), our criterion will be quite useful. For the points on the line
given by $\lambda_1 = \lambda_2 = \cdots = \lambda_{v-1}$ all give the same result and for the class of designs
with higher efficiency, i.e., for $\lambda$'s not too widely spread, they will be more
or less equal. This is the region where our criterion will be quite effective. As
shown below this criterion has the advantages of simplicity and practical usefulness.

We can express $C$ as $\sum \lambda_i L_i L_i'$, where $L_i$ is the canonical vector corresponding
to $\lambda_i$. This immediately gives $C^2 = (\sum \lambda_i L_i L_i')(\sum \lambda_i L_i L_i') = \sum \lambda_i^2 L_i L_i'$.
Hence Trace $C^2 = \sum \lambda_i^2$, but Trace $C^2 = \sum_i \sum_j c_{ij}^2$, and therefore $\sum \lambda_i^2 =
\sum_i \sum_j c_{ij}^2$. Hence,

         $E_4 = (v - 1)^{-3/2}((k - 1)/k)^2 v^2 R/(\sum_i \sum_j c_{ij}^2)^{1/2}.$

A further simplification can be had for PBIB and circulant designs, where
$\sum_j c_{ij}^2$ is the same for all $i$.

For the other three criteria, elegant expressions are seldom available. Since
$E_4$ follows directly from the C matrix it is easiest to compute; we do not have to
solve the normal equations or evaluate the $\lambda$'s.
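Computing $E_4$ from the entries of $C$ alone can be sketched as follows. This is an illustrative sketch rather than the author's computation. The function name `e4` is hypothetical, and for unequal replications it assumes $C = \mathrm{diag}(r_i) - NN'/k$, which reduces to $rI - NN'/k$ in the equi-replicate case. The example uses the seven-block BIB with $v = 7$, $k = 3$, for which the balanced value $(1 - 1/k)/(1 - 1/v)$ should be recovered.

```python
import math

def e4(v, k, blocks):
    """E_4 computed from the entries of C, without evaluating latent roots."""
    reps = [sum(1 for blk in blocks if i in blk) for i in range(v)]
    # conc[i][j]: (i, j) entry of N N', the number of blocks containing both i and j.
    conc = [[sum(1 for blk in blocks if i in blk and j in blk) for j in range(v)]
            for i in range(v)]
    # Assumption: C = diag(r_i) - N N'/k when replications are unequal.
    c = [[(reps[i] if i == j else 0) - conc[i][j] / k for j in range(v)]
         for i in range(v)]
    sum_sq = sum(x * x for row in c for x in row)   # equals the sum of squared roots
    r_bar = sum(reps) / v                           # average replication R
    return (v - 1) ** -1.5 * ((k - 1) / k) ** 2 * v ** 2 * r_bar / math.sqrt(sum_sq)

# BIB design with v = b = 7, r = k = 3, lambda = 1.
fano = [{0, 1, 2}, {0, 3, 4}, {0, 5, 6}, {1, 3, 5}, {1, 4, 6}, {2, 3, 6}, {2, 4, 5}]
balanced_value = (1 - 1 / 3) / (1 - 1 / 7)
```

Calling `e4(7, 3, fano)` can then be compared with `balanced_value`; no normal equations are solved and no latent roots are evaluated.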


Acknowledgment. My sincere thanks are due to Dr. K. R. Nair for his guid- 
ance in writing this paper. 


REFERENCES 


[1] SYLVAIN EHRENFELD, "On the efficiency of experimental designs," Ann. Math. Stat.,
Vol. 26 (1955), pp. 247-255.

[2] OSCAR KEMPTHORNE, "The efficiency factor of an incomplete block design," Ann. Math.
Stat., Vol. 27 (1956), pp. 846-849.

[3] J. KIEFER, "On the non-randomised optimality and randomised non-optimality of
symmetrical designs," Ann. Math. Stat., Vol. 29 (1958), pp. 675-699.

[4] A. M. KSHIRSAGAR, "A note on incomplete block designs," Ann. Math. Stat., Vol. 29
(1958), pp. 907-910.

[5] H. K. NANDI, "On the efficiency of experimental designs," Calcutta Stat. Assoc. Bull.,
Vol. 3 (1950), pp. 167-171.

[6] ABRAHAM WALD, "On the efficient design of statistical investigations," Ann. Math.
Stat., Vol. 14 (1943), pp. 134-140.




ON THE COMPLETENESS OF ORDER STATISTICS

University of California, La Jolla, and San Diego State College

DAVID BLACKWELL AND LEO BREIMAN

1. Introduction and summary. Let $X_1, X_2, \ldots, X_n$ be a sample of a one-dimensional
random variable $X$; let the order statistic $T(X_1, X_2, \ldots, X_n)$ be
defined in such a manner that $T(x_1, x_2, \ldots, x_n) = (x^{(1)}, x^{(2)}, \ldots, x^{(n)})$ where
$x^{(1)} \leq x^{(2)} \leq \cdots \leq x^{(n)}$ denote the ordered $x$'s; and let $\Omega$ be a class of one-dimensional
cpf's, i.e., cumulative probability functions.

The order statistic, $T$, is said to be a complete statistic with respect to the
class, $\{P^{(n)} \mid P \in \Omega\}$, of $n$-fold power probability distributions if

         $E_{P^{(n)}}\{h[T(X_1, \ldots, X_n)]\} = 0$

for all $P \in \Omega$ implies $h[T(x_1, \ldots, x_n)] = 0$, a.e., $P^{(n)}$, for all $P \in \Omega$. The class
$\Omega$ is said to be symmetrically complete whenever the latter condition holds.

Since the completeness of the order statistic plays an essential role in nonparametric
estimation and hypothesis testing, e.g., Fraser [2] and Bell [1], it is
of interest to determine those classes of cpf's for which the order statistic is
complete.

Many of the traditionally studied classes of cpf's on the real line are known
to be symmetrically complete, e.g., all continuous cpf's ([4], pp. 131-134, 152-153);
all cpf's absolutely continuous with respect to Lebesgue measure ([3],
pp. 23-31); and all exponentials of a certain form ([4], pp. 131-134).

The object of this note is to present a different ([4], pp. 131-134, 152-153)
demonstration of the symmetric completeness of the class of all continuous
cpf's; and to extend this and other known completeness results to probability
spaces other than the real line, e.g., Fraser [2], and Lehmann and Scheffé [5],
[6].

The paper is divided into four sections. Section 1 contains the introduction 
and summary. In Section 2 the notation and terminology are introduced. The 
main theorem is presented in Section 3, and some consequences of the proof of 
the main theorem and known results are indicated in Section 4. 


2. Terminology and notation. Let (X, 8) be an arbitrary measurable space; 


\, an arbitrary measure on (X, $); and Q, a class of probability measures on 
(X, $). 


Consistent with the notation of Scheffé [7] one defines the following sets and 


classes. 


%(X) = the class of all probability measures on (X, $8); 
,(X) = the class of all nondegenerate probability measures on (X, 8); 
(X) = the class of all nonatomic probability measures on (X, 8); 
(A) = {[P ¢€Q&(X)|P «XI, ie., the class of probability measures abso- 
lutely continuous with respect to i; 
2(35, 4) = {Aa | A € Ko} where 3.7 = {A €3|0<A(A) < ~} and dA,(C) = 
\(AC)/XMA) for all C € 8; 
Me = [A e$| P(A) = O for all P ¢Q}, i.e., the null class of Q; 
(x, s™) = the product n-space generated by (X, 8); 
\” = \x --- x = the n-fold power measure on (X“”’, 8°”) generated by A; 
a” = {P™ | P ¢Q} = class of power measures generated by 0; 
Non) = {A ce 8” | P™ (A) = O for all P ¢ O} = null class of 2. 
A class @ is said to be symmetrically complete for n = k if hy = O[P"’) ie., 
h, = 0 a.e. with respect to P“’, for all P ¢ Q, whenever h, satisfies 
(a) Ay is a symmetric function [measurable on (X“, 5” )]; and 
(b) fae dP™ = 0 for all P € Q. 
With this notation we now demonstrate that the class 2,(X) is symmetrically 
complete for all n. 
In the sequel it will be assumed that $\gamma$ is an arbitrary fixed nonatomic probability
measure on $(X, \mathcal{S})$; that $h_n$ is a symmetric measurable function on
$(X^{(n)}, \mathcal{S}^{(n)})$; and that $\mathcal{A}$ is a semi-algebra which generates $\mathcal{S}$. [Note: $\mathcal{A}$ is a semi-algebra
if $X \in \mathcal{A}$; $\mathcal{A}$ is closed under finite intersections; and $A, B \in \mathcal{A}$ with $A \subset B$
implies the existence of $\{A_0, A_1, \ldots, A_m\} \subset \mathcal{A}$ such that $A = A_0 \subset A_1 \subset
\cdots \subset A_m = B$ and $A_i - A_{i-1} \in \mathcal{A}$ for $i = 1, 2, \ldots, m$.]


3. The main theorem. The proof of the main theorem utilizes the facts that
$\Omega(\mathcal{A}, \gamma)$ is symmetrically complete for properly chosen $\mathcal{A} \subset \mathcal{S}$; that the null
classes of 2” (@, P;) and 2," (P;) are equal; that, therefore, 2,(/°,) is sym- 
metrically complete; and that so is 2,(X), since it is the union of classes 2;(P). 

These ideas are given more precisely by the following three lemmas. 

Lemma 1. (Fraser) If y is an arbitrary nonatomic probability measure on (X, $) 
and @ is a semi-algebra which generates $, then 2(@, y) is symmetrically complete 
for all n. 

Proor. See Fraser [2]. 

Lemma 2. If Pye 2(X), then Tlat)(e,P3) = Ta, (*)(P4) for all n. 

Proor. Let n be an arbitrary fixed positive integer. Clearly, Pi’ (A) = 
implies P(A) = 0 for all P ¢ Q,(P,). This latter condition implies N,p,.), C 
No,(*):p,). On the other hand, since 


Pi” ¢2™(@, Pi) C 23" (P)), Ny pe) D Nawra.P:) D Nar, - 


The conclusion follows immediately. 

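The symmetric completeness of $\Omega(\mathcal{C}, P_1)$ and the equality of the two null
classes are sufficient to establish the next lemma.

Lemma 3. If $P_1 \in \Omega_2(X)$, then $\Omega_3(P_1)$ is symmetrically complete for all n.

Proof. $\int h_n \, dP^{(n)} = 0$ for all $P \in \Omega_3(P_1)$ implies $P^{(n)}\{h_n \ne 0\} = 0$ for
$P \in \Omega(\mathcal{C}, P_1) \subset \Omega_3(P_1)$. Hence $\{h_n \ne 0\} \in N_{\Omega^{(n)}(\mathcal{C}, P_1)} = N_{\Omega_3^{(n)}(P_1)}$ and
$h_n = 0\ [P^{(n)}]$ for all $P \in \Omega_3(P_1)$.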

The main theorem now follows from the preceding lemmas and the fact that 
any measure absolutely continuous with respect to a nonatomic measure is itself 
nonatomic. 

MAIN THEOREM. The class $\Omega_2(X)$ of all nonatomic probability measures on an
arbitrary measurable space $(X, \mathcal{S})$ is a symmetrically complete class for all n. In
particular, the class $\Omega_2$ of all continuous cpf's on the real line is a symmetrically
complete class for all n.

Proof. It is sufficient to demonstrate that for arbitrary fixed n, and arbitrary
fixed $P_1 \in \Omega_2(X)$, $P_1^{(n)}\{h_n \ne 0\} = 0$, whenever $h_n$ is a measurable symmetric
function with the property: $\int h_n \, dP^{(n)} = 0$ for all $P \in \Omega_2(X)$.

Under such circumstances it is clear that $\Omega_3(P_1) \subset \Omega_2(X)$. Therefore, Lemma
3 guarantees for symmetric $h_n$ such that $\int h_n \, dP^{(n)} = 0$ for all $P \in \Omega_2(X)$, that
$P^{(n)}\{h_n \ne 0\} = 0$ for all $P \in \Omega_3(P_1)$. But $P_1 \in \Omega_3(P_1)$ and, consequently,
$P_1^{(n)}\{h_n \ne 0\} = 0$.


4. Extensions. The symmetric completeness of several other classes of statistical
interest can be extended to abstract spaces. In fact, by an extension of
the ideas above and those of Fraser ([2], [3], pp. 23-31), one can demonstrate the
following result.

Theorem. If $(X, \mathcal{S})$ is an arbitrary measurable space, then (i) %(X), 2(X)
and %(X) are symmetrically complete for all n.

If, further, $\lambda$ is a nonatomic, σ-finite measure on $\mathcal{S}$ and $\mathcal{C}$ is a semialgebra which
generates $\mathcal{S}$, then, (ii) Q(@, X), (8, A) and Q,(A) are symmetrically complete for
all n.

REFERENCES 

[1] C. B. Bell, "On the structure of distribution-free statistics," Ann. Math. Stat., Vol.
31 (1960), pp. 703-709.

[2] D. A. S. Fraser, "Completeness of order statistics," Can. J. Math., Vol. 6 (1954), pp.
42-45.

[3] D. A. S. Fraser, Non-Parametric Methods in Statistics, John Wiley and Sons, New York,
1957.

[4] E. L. Lehmann, Testing Statistical Hypotheses, John Wiley and Sons, New York,
1959.

[5] E. L. Lehmann and Henry Scheffé, "Completeness, similar regions and unbiased
estimation," Part I, Sankhyā, Vol. 10 (1950), pp. 305-340.

[6] E. L. Lehmann and Henry Scheffé, "Completeness, similar regions and unbiased
estimation," Part II, Sankhyā, Vol. 15 (1955), pp. 219-236.

[7] H. Scheffé, "On a measure problem arising in the theory of non-parametric tests,"
Ann. Math. Stat., Vol. 14 (1943), pp. 227-233.




ON CENTERING INFINITELY DIVISIBLE PROCESSES' 


By Ronald Pyke
Stanford University and Columbia University


The concept of centering stochastic processes having independent increments, 
introduced by Lévy, is applied to processes having both stationary and inde- 
pendent increments. The main purpose of this note is to answer the question as 
to what centering functions preserve the stationarity of the increments. 

In 1934, Lévy [1] proved that any stochastic process with independent incre- 
ments may be transformed by subtraction of a sure function, called a centering 
function, into a process whose sample functions possess certain desirable smooth- 
ness properties. (cf. Lévy [2] and Doob [3]). It is clear that the transformed 
process, called the centered process, is also a process possessing independent 
increments. The purpose of this paper is to show that a process having stationary 
and independent increments may be centered in such a way so as to preserve 
the stationarity as well as the independence of the increments. 

To be more precise, consider the following definitions (cf. Doob [3] p. 407).
For a set $T \subset R_1$, let $T^*$ denote the set of limit points of T except that the
supremum and infimum of T are to be included in $T^*$ only if they belong to T.
Definition 1: A stochastic process $\{X_t : t \in T\}$ is said to be centered if and
only if
(a) for every $\{t_n\} \subset T$ satisfying $t_n \uparrow t \in T^*$ ($t_n \downarrow t \in T^*$) there exists a random
variable $X_{t-}$ ($X_{t+}$), independent of the particular sequence, such that

$$X_{t_n} \to X_{t-} \ \text{a.s.} \qquad (X_{t_n} \to X_{t+} \ \text{a.s.})$$

(b) there exists a function g defined and continuous on the closure of T such
that any difference $X_t - X_s$, $t, s \in T^*$, or any such difference with t replaced
by t+ or t− and/or s replaced by s+ or s−, is constant a.s. if and only if

$$X_t - X_s = g(t) - g(s) \ \text{a.s.}$$

(c) $X_{t-} = X_t = X_{t+}$ a.s. for all but at most a countable number of points
of T.

This definition differs from that given by Doob only through condition (b).
In Doob's definition, the function g was restricted to be constant over $T^*$. The
above modified definition has the advantage of making it unnecessary to distinguish
between degenerate and non-degenerate processes in the theorems
below, as well as of insuring the truth of the statement that if $\{X_t : t \in T\}$ is a
centered process, then so is $\{X_t + h(t) : t \in T\}$ for every continuous and bounded
function h on T. This statement is not true under the more restrictive definition
of Doob.

Definition 2: A function $c : T \to R_1$ is said to be a centering function of a
stochastic process $\{X_t : t \in T\}$ if and only if the process $\{X_t - c(t) : t \in T\}$ is
centered.

It is clear that one may always find a centering function, such that the result- 
ing centered process satisfies (b) of Definition 1 with g = 0. 

A stochastic process, $\{X_t : t \in T\}$, having stationary and independent increments
and for which $T = [0, +\infty)$ and $X_0 = 0$ a.s. is said to be an Infinitely
Divisible (I.D.) process. As is evident from Lemma 1 below, a correspondence
may be defined in a natural way between the class of infinitely divisible random
variables (r.v.) and the class of I.D. processes. For properties of infinitely divisible
r.v.'s used in this paper, the reader is referred to [4] and [5].

In the case of a centering function for processes with independent increments,
uniqueness is clearly impossible. One possible centering function is that used by
Doob ([3], p. 408), namely the solution to $E\{\arctan [X_t - c(t)]\} = 0$. It should
be noted that this particular centering function would not preserve stationarity
of increments in case the given process were a non-degenerate I.D. process.

Define for all $w \in R_1$ and $t \ge 0$, $f(w:t) = E[e^{iwX_t}]$.

Lemma 1: A stochastic process $\{X_t : t \ge 0\}$ having independent increments is an
I.D. process if and only if there exist unique functions $c : [0, \infty) \to R_1$ and $\psi : R_1 \to$
complex plane, satisfying (i) for all $s, t \ge 0$, $c(s) + c(t) = c(s + t)$, (ii) for
all rational $r \ge 0$, $c(r) = 0$, (iii) for all $t \ge 0$, $w \in R_1$, $\log f(w:t) = iwc(t) + t\psi(w)$.
Proof: The proof of the sufficiency is left to the reader. The main interest
is in the necessity of these conditions. A straightforward proof of this is possible
using the Lévy-Khintchine representation of f(w:t), namely

(1) $\log f(w:t) = iw\rho(t) + \int \bigl(e^{iwz} - 1 - iwz/(1 + z^2)\bigr)(1 + z^2)/z^2 \, dG(z:t)$,

where $G(\cdot:t)$ is a bounded non-decreasing right-continuous function, since
clearly

(2) $f(w:s + t) = f(w:s) f(w:t)$.


The purpose of the proof given here is to demonstrate that the powerful tool 
(1) is not essential for proving the necessity of the conditions of Lemma 1. 
This is important, it is felt, because the result stated as Lemma 1 should logi- 
cally be proven very shortly after an I.D. process is defined, and because such 
a definition may well precede any discussion of infinitely divisible r.v.’s. 

For each n, $[f(w:t)]^{1/n}$ is a characteristic function. It is well known, and easily
proven, that therefore, for all p > 0, $[f(w:t)]^p$, properly defined, is a characteristic
function and that for all $w \in R_1$ and $t \ge 0$, $f(w:t) \ne 0$. Because of (2),
$|f(w:s)| \, |f(w:t)| = |f(w:s + t)|$. Since $0 < |f(w:s)| \le 1$, the solution of this
functional equation is given by $2 \log |f(w:t)| = t[\psi(w) + \psi(-w)]$ where $\psi(w) =
\log f(w:1)$. Consequently, upon defining $q(w:t) = e^{\psi(w)} [f(w:t)]^{-1/t}$, it follows
that $|q(w:t)| = 1$, and that $q(\cdot:t)$ is a continuous function for each $t > 0$.
Moreover, since for rational r, $f(w:rt) = [f(w:t)]^r$, one has

$$q(w:t) = \lim e^{\psi(w)} [f(w:t)]^{-r} = \lim \, [f(w:1)/f(w:rt)]$$

where the limit is taken as $r \uparrow t^{-1}$ over the rationals. However, by (2),

$$f(w:1)/f(w:rt) = f(w:1 - rt)$$

is a characteristic function and hence so is q(w:t). Because $|q(w:t)| = 1$, the
proof is then complete since for each $t > 0$, $q(w:t) = e^{-iwc(t)/t}$ for some real number
c(t). It may be easily checked that the functions c and $\psi$ thus defined satisfy
the required conditions.

By using the function c in Lemma 1, one obtains

Corollary 1: For an I.D. process $\{X_t : t \ge 0\}$, there exists a centering function
c such that the resulting centered process $\{X_t - c(t) : t \ge 0\}$ is also an I.D. process.

It is remarked that a stationarity preserving centering function for an I.D. 
process is unique up to the addition of straight lines through the origin. 
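
The step from Lemma 1 to Corollary 1 can be sketched in a few lines. The sketch assumes that condition (iii) of Lemma 1 reads $\log f(w:t) = iwc(t) + t\psi(w)$, and it writes $Y_t = X_t - c(t)$, a label introduced here only for illustration.

```latex
\begin{align*}
E\,e^{iwY_t} &= e^{-iwc(t)}\, f(w:t) = e^{t\psi(w)}, \\
Y_{s+t} - Y_s &= (X_{s+t} - X_s) - \bigl(c(s+t) - c(s)\bigr)
              = (X_{s+t} - X_s) - c(t), \\
E\,e^{iw(Y_{s+t} - Y_s)} &= e^{-iwc(t)}\, f(w:t) = e^{t\psi(w)} .
\end{align*}
```

The second line uses the additivity (i) of c, and the third uses the stationarity of the increments of X. Under these assumptions every increment of Y over an interval of length t has the same characteristic function, and independence of the increments is unaffected because c is a sure function.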

It is evident that portions of Definition 1 are superfluous when applied to 
I.D. processes. In fact, one can easily prove 

Lemma 2: An I.D. process is centered if and only if its characteristic function
f(w:t) is continuous in t.

Corollary 2: An I.D. process is centered if and only if for all sequences
$0 \le t_n \to t$,

$$X_{t_n} \to X_t \ \text{a.s.}$$

It is emphasized that the above results are neither difficult nor too surprising.
The fact that a centered I.D. process has a characteristic function which satisfies
$\log f(w:t) = t\psi(w)$ is well known (e.g., cf. Lévy [2] p. 186, Doob [3] p. 419,
Ito [6]). The justification for the presentation of the above material is two-fold;

(i) Corollary 1 has not been located in the literature and 

(ii) several recent papers in the literature indicate that Lemma 1 and Corol- 
lary 1 are not known. 

Concerning (ii), several authors assume that (8) is true for all separable I.D.
processes (cf. [7], [11]) while in other papers the exact role played by centering
in the case of I.D. processes seems to have been misunderstood (cf. [8]). Furthermore,
as a consequence of Lemma 1, the assumption (retaining the notation of
the papers referred to) that $\phi(t:\lambda)$ be continuous in $\lambda$ may be removed from
Theorem 1 of [9] and from Theorem 1 of [10]. For example Theorem 1 (iii) of
[9] could be strengthened to read: F(x:\) ¢ C, if and only if (t:4) = [f(t)Pe” 
where f(t) is a characteristic function and where c is a function satisfying the 
conditions of Lemma 1. 

As mentioned in the above paragraph separability is sometimes thought to 
imply that a process is centered. Although this is not true, it is possible to relate 
these two properties as well as the properties of measurability and of bounded- 
ness of sample functions, as stated in 

Lemma 3: For a separable I.D. process $X = \{X_t : t \ge 0\}$, the following conditions
are equivalent: (i) X is centered, (ii) X is measurable, (iii) there exists a
separating sequence which is a subset of the rational numbers, (iv) there exists an
open interval in $(0, \infty)$ over which almost all sample functions are either bounded
from above or below.


THE STRONG LAW OF LARGE NUMBERS FOR A CLASS OF
MARKOV CHAINS


By Leo Breiman
University of California at Los Angeles


1. Introduction. The following problem has arisen in the study of Markov
chains of the learning model type. (See [1] for definitions). Let the state space
be, for example, the unit interval [0, 1] and let the chain have a unique invariant
initial distribution $\pi(dx)$. Now let the chain be started at some point $x \in [0, 1]$;
is it true that

(1) $\frac{1}{N} \sum_{n=1}^{N} X_n \to E_\pi X_1$ a.s.?

From the ergodic theorem we know that there is a set $S \subset [0, 1]$ such that
$\pi(S) = 1$, and, if $x \in S$, then (1) holds. In learning models, however, $\pi$ may be
singular with respect to Lebesgue measure, so a stronger result is desirable. We
prove for a wide class of chains, including learning models, that (1) holds for
every possible starting point. This result is well known for chains satisfying
Doeblin's condition. Unfortunately, learning models do not.


2. The theorem. Let the state space $\Omega$ be a compact Hausdorff space, and
$\mathcal{B}$ the Baire σ-field in $\Omega$. The Markov transition probabilities $P(A \mid x)$ are assumed
probabilities on $\mathcal{B}$ for fixed x, $\mathcal{B}$-measurable functions on $\Omega$ for fixed A,
and such that there is a unique probability $\pi$ on $\mathcal{B}$ satisfying

$$\pi(A) = \int P(A \mid x)\, \pi(dx), \quad \text{all } A \in \mathcal{B}.$$

Let C be the class of all continuous functions on $\Omega$, and add the final restriction
that, if $f \in C$, so is $E(f(X_1) \mid X_0 = x)$. Let $\Omega^\infty$ be the infinite sequence space
with coordinates in $\Omega$. In the usual way, we construct a σ-field $\mathcal{B}^\infty$ in $\Omega^\infty$ and,
using the initial distribution $X_0 = x$, a probability $P_x$ on $\mathcal{B}^\infty$. Then

Theorem. Let $\phi \in C$. Then, for any $x \in \Omega$,

$$\frac{1}{N} \sum_{n=1}^{N} \phi(X_n) \to E_\pi \phi(X_1) \quad \text{a.s. } P_x .$$

Proof. The proof of this theorem is a combination of the Kakutani-Yosida
norms ergodic lemma and an argument concerning conditional probabilities.


3. The topological part. We prove first a proposition which summarizes the
topological ergodic theorem we need. Define the operator T on C into C by
$(T\phi)(x) = E(\phi(X_1) \mid X_0 = x)$, so that $(T^n\phi)(x) = E(\phi(X_n) \mid X_0 = x)$, and
set $T_N \phi = \sum_{n=1}^{N} T^n \phi / N$. Then
Proposition 1. For any $\phi \in C$, $T_N \phi$ converges uniformly to $E_\pi \phi(X_1)$.
Proof. This proposition and its proof are well known in linear space theory.
However, for completeness, we give a short demonstration. Let $\mathfrak{M}$ be the class
of probability measures on $\mathcal{B}$, and consider the operators $V$, $V_N$ on $\mathfrak{M}$ into $\mathfrak{M}$
defined by

$$(VQ)(A) = \int P(A \mid x)\, Q(dx), \qquad V_N Q = \sum_{n=1}^{N} V^n Q / N .$$

By the Helly-Bray theorem, $\mathfrak{M}$ is closed and compact in the weak dual topology,
so that there are plenty of convergent subsequences $V_{N_k} Q$. But every limit point
of $V_N Q$ is invariant under V and hence is identified with $\pi$, so that $V_N Q \to \pi$ in
our topology. Therefore, for every $Q \in \mathfrak{M}$ and $\phi \in C$ we have

$$(T_N \phi, Q) = (\phi, V_N Q) \to (\phi, \pi)$$

and hence $T_N \phi$ converges weakly to $E_\pi \phi$. Applying the Kakutani-Yosida norms
ergodic lemma (see, for example, [2], pg. 441), we conclude that $T_N \phi$ converges
uniformly to $E_\pi \phi$.
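
Proposition 1 can be illustrated numerically in the simplest setting, a finite state space, where every function is continuous and T is multiplication by the transition matrix. The sketch below is only an illustration under that assumption: the two-state matrix `P`, the function `phi`, and the helper names `apply_T` and `cesaro_average` are hypothetical and do not come from the paper.

```python
def apply_T(P, phi):
    """One application of T: (T phi)(x) = sum over y of P[x][y] * phi[y]."""
    return [sum(p * v for p, v in zip(row, phi)) for row in P]


def cesaro_average(P, phi, N):
    """Return T_N phi = (T phi + T^2 phi + ... + T^N phi) / N."""
    total = [0.0] * len(phi)
    current = list(phi)
    for _ in range(N):
        current = apply_T(P, current)
        total = [t + c for t, c in zip(total, current)]
    return [t / N for t in total]


if __name__ == "__main__":
    P = [[0.9, 0.1], [0.5, 0.5]]  # hypothetical two-state chain
    phi = [1.0, 0.0]              # indicator of state 0
    target = 5 / 6                # E_pi phi, since pi = (5/6, 1/6) solves pi P = pi
    for N in (10, 100, 1000):
        sup_error = max(abs(v - target) for v in cesaro_average(P, phi, N))
        print(N, sup_error)       # sup over starting states of |T_N phi - E_pi phi|
```

The quantity printed for each N is the uniform distance between $T_N \phi$ and the constant $E_\pi \phi$, which is the distance Proposition 1 asserts tends to zero.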


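4. The probabilistic part. Let $X_0, X_1, \cdots$ be distributed according to $P_x$,
and define

$$Z_n^{(1)} = \begin{cases} \phi(X_n) - E(\phi(X_n) \mid X_{n-1}), & n \ge 1, \\ 0, & \text{otherwise,} \end{cases}$$

$$Z_n^{(k)} = \begin{cases} E(\phi(X_n) \mid X_{n-k+1}) - E(\phi(X_n) \mid X_{n-k}), & n \ge k, \\ 0, & n < k. \end{cases}$$

Proposition 2. $\frac{1}{N} \sum_{n=1}^{N} Z_n^{(k)} \to 0$ a.s. $P_x$.
Proof. We use the following result ([2], pg. 387). Let $Y_1, Y_2, \cdots$ be a sequence
of random variables such that $E(Y_n \mid Y_{n-1}, \cdots, Y_1) = 0$ and

$$E Y_n^2 \le M < \infty ,$$

all n. Then

$$\frac{1}{N} \sum_{n=1}^{N} Y_n \to 0 \quad \text{a.s.}$$

To apply this, note that

$$E(Z_n^{(k)} \mid Z_{n-1}^{(k)}, \cdots, Z_1^{(k)}) = E\bigl(E(Z_n^{(k)} \mid X_{n-k}, X_{n-k-1}, \cdots, X_0) \mid Z_{n-1}^{(k)}, \cdots, Z_1^{(k)}\bigr),$$

and that, since the $X_0, X_1, \cdots$ form a Markov chain,

$$E(Z_n^{(k)} \mid X_{n-k}, \cdots) = E(Z_n^{(k)} \mid X_{n-k}) = 0 .$$
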

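Further, $E(Z_n^{(k)})^2 \le 2(\sup |\phi|)^2$, thus giving the proposition.

5. Conclusion of the proof. To complete the demonstration of the theorem,
write

$$\phi(X_n) - E(\phi(X_n) \mid X_{n-k}) = Z_n^{(1)} + Z_n^{(2)} + \cdots + Z_n^{(k)}, \qquad n > k.$$

Thus, by Proposition 2,

$$\lim_N \left| \frac{1}{N} \sum_{n=k+1}^{N} \phi(X_n) - \frac{1}{N} \sum_{n=k+1}^{N} E(\phi(X_n) \mid X_{n-k}) \right| = 0, \quad \text{a.s. } P_x .$$

Or, neglecting at most k terms,

$$\lim_N \left| \frac{1}{N} \sum_{n=1}^{N} \phi(X_n) - \frac{1}{N} \sum_{n=1}^{N} E(\phi(X_{n+k}) \mid X_n) \right| = 0, \quad \text{a.s. } P_x ,$$

so that, for fixed M,

$$\lim_N \left| \frac{1}{N} \sum_{n=1}^{N} \phi(X_n) - \frac{1}{N} \sum_{n=1}^{N} \frac{1}{M} \sum_{k=1}^{M} E(\phi(X_{n+k}) \mid X_n) \right| = 0, \quad \text{a.s. } P_x .$$

By Proposition 1, for any $\epsilon > 0$, we may choose M such that

$$\left| \frac{1}{M} \sum_{k=1}^{M} E(\phi(X_{n+k}) \mid X_n) - E_\pi \phi(X_1) \right| \le \epsilon ,$$

and for such an M we have

$$\limsup_N \left| \frac{1}{N} \sum_{n=1}^{N} \phi(X_n) - E_\pi \phi(X_1) \right| \le \epsilon, \quad \text{a.s. } P_x ,$$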

proving the theorem.


EMPTINESS IN THE FINITE DAM

By A. Ghosal

Central Fuel Research Institute, Dhanbad, India


1. Summary: The paper discusses the general problem of emptiness in the 
finite dam and considers the probability that, starting with an arbitrary storage, 
the dam dries up before it fills completely. Some exact results are given both for 
discrete and continuous inputs. An interesting relation between this probability
and the asymptotic distribution function of the dam content has also been obtained.

2. Introduction: This paper is based on the storage system model given by
Moran [5]. The storage, $Z_t$, of a dam of finite capacity, k, is defined for discrete
time t (t = 0, 1, 2, ...) as the dam content just after an instantaneous release
at time t, and just before an input, $X_t$, flows into it over the time interval
(t, t + 1). The model is subject to the conditions

(i) the inputs $X_t$ during the intervals (t, t + 1) are independently and identically
distributed;

(ii) there is an overflow, max ($Z_t + X_t - k$, 0), during the interval (t, t + 1),
while min (k, $Z_t + X_t$) is left in the dam just before the release occurs;

(iii) the amount of water released at time t + 1 is min (m, $Z_t + X_t$), where
m is a constant < k.

It has been shown that the processes ($Z_t$) and ($Z_t + X_t$) are both Markov
chains, and the problem of obtaining their stationary distributions has been
dealt with by Moran [5], [6], Gani [2], Gani and Prabhu [3] and Prabhu [7], [8].

This paper deals with the problem of finding the probability that, given an
arbitrary initial storage and the distribution of the input ($X_t$), the dam dries
up before it fills completely. It also shows that this probability bears an elegant
relationship with the asymptotic distribution function of the dam content. D.
G. Kendall [4] derived the time required by an infinite dam to dry up; Prabhu
[8] dealt with the probability of emptiness at a given storage level for the finite
dam, but for m = 1. Here, the problem for any m is dealt with.


3. The Probability of Emptiness, Discrete Input: If the release rules given in
Section 2 operate, we have

(1) $$Z_{t+1} = \begin{cases} Z_t + X_t - m & \text{if } m < Z_t + X_t < k, \\ 0 & \text{if } Z_t + X_t \le m, \\ k - m & \text{if } Z_t + X_t \ge k. \end{cases}$$

Let $\{g_j\}$ be the probability distribution of $X_t$, so that

(2) $\Pr\{X_t = j\} = g_j$, (j = 0, 1, ...).

Let $V_i$ be the conditional probability that, starting with storage i, the dam
becomes empty before it fills completely. It is easy to derive the following:

(3) $$V_i = \begin{cases} \sum_{j=0}^{m-i} g_j + \sum_{j=1}^{k-m-1} g_{j+m-i} V_j & (i \le m), \\ \sum_{j=i-m}^{k-m-1} g_{j+m-i} V_j & (m < i \le k-m-1). \end{cases}$$

We note that the states 0 and k − m are absorbing, so that $V_i = 1$ for $i \le 0$,
$V_{k-m+r} = 0$ for $r \ge 0$.
3.1. Geometric Input: Consider an input distribution of the geometric type,

(4) $g_j = a b^j$ (b = 1 − a, j = 0, 1, ...).

Applying the transformation $V_i = 1 - b^{-i} \phi_i$ to (3), after substituting (4)
in (3), we get

(5) $$\phi_i = \begin{cases} \alpha & (i \le m), \\ \alpha - \lambda \sum_{j=1}^{i-m-1} \phi_j & (i > m), \end{cases}$$

where

$$\lambda = a b^m, \qquad \alpha = b^k + \lambda \sum_{j=1}^{k-m-1} \phi_j .$$

We solve for $\phi_i$ successively for the ranges (m, 2m), (2m, 3m), ... in terms
of the unknown constant $\alpha$. For instance, for $m < i \le 2m$, we get

(6) $\phi_i = \alpha [1 - \lambda (i - m - 1)]$.

Let k = (N + 1)m + U, where $0 \le U < m$. We get the general expression

(7) $$\phi_i = \alpha \sum_{q=0}^{n} (-\lambda)^q \binom{i - qm - 1}{q} \qquad (nm < i \le (n+1)m;\ n = 0, 1, \ldots, N+1).$$

We solve for $\alpha$ from (6) and (7):

(8) $$\alpha = b^k \left[ \sum_{q=0}^{N+1} (-\lambda)^q \binom{k - qm - 1}{q} \right]^{-1} .$$

From (7) we have

(9) $$V_i = 1 - \alpha b^{-i} \sum_{q=0}^{n} (-\lambda)^q \binom{i - qm - 1}{q} \qquad (nm < i \le (n+1)m;\ n = 0, 1, \ldots, N).$$
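
As a numerical check on this section, the sketch below solves the linear system (3) exactly for a geometric input and compares the result with a closed form. It rests on stated assumptions: the transformation is taken as $V_i = 1 - b^{-i}\phi_i$ with $\lambda = ab^m$, the closed form is $V_i = 1 - \alpha b^{-i} \sum_q (-\lambda)^q \binom{i-qm-1}{q}$ with $\alpha = b^k / \sum_q (-\lambda)^q \binom{k-qm-1}{q}$, and binomial coefficients with upper index below the lower index are counted as zero. The function names are hypothetical.

```python
from fractions import Fraction
from math import comb


def geometric_pmf(a):
    """Return g with g(j) = a * b**j, b = 1 - a, in exact arithmetic."""
    a = Fraction(a)
    b = 1 - a
    return lambda j: a * b ** j


def solve_emptiness(g, k, m):
    """Solve system (3) for V_1, ..., V_{k-m-1} by Gauss-Jordan elimination."""
    n = k - m - 1
    rows = []
    for i in range(1, n + 1):
        row = [Fraction(0)] * (n + 1)
        for j in range(1, n + 1):
            x = j + m - i  # input that moves the storage from i to j
            if x >= 0:
                row[j - 1] -= g(x)
        row[i - 1] += 1
        # The dam empties in one step when the input is at most m - i.
        row[n] = sum((g(x) for x in range(0, m - i + 1)), Fraction(0))
        rows.append(row)
    for col in range(n):
        pivot = next(r for r in range(col, n) if rows[r][col] != 0)
        rows[col], rows[pivot] = rows[pivot], rows[col]
        rows[col] = [v / rows[col][col] for v in rows[col]]
        for r in range(n):
            if r != col and rows[r][col] != 0:
                factor = rows[r][col]
                rows[r] = [v - factor * p for v, p in zip(rows[r], rows[col])]
    return [rows[i][n] for i in range(n)]


def closed_form(a, k, m):
    """Evaluate the assumed closed form for V_1, ..., V_{k-m-1}."""
    a = Fraction(a)
    b = 1 - a
    lam = a * b ** m

    def series(i):
        # Sum of (-lam)^q * C(i - q*m - 1, q); later terms vanish.
        total, q = Fraction(0), 0
        while i - q * m - 1 >= q:
            total += (-lam) ** q * comb(i - q * m - 1, q)
            q += 1
        return total

    alpha = b ** k / series(k)
    return [1 - alpha * series(i) / b ** i for i in range(1, k - m)]


if __name__ == "__main__":
    a, k, m = Fraction(1, 2), 5, 2
    print(solve_emptiness(geometric_pmf(a), k, m))
    print(closed_form(a, k, m))
```

Exact fractions are used so that the two computations can be compared for equality rather than to within a tolerance.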
In many cases, it may be enough to know the bounds within which $V_i$ should
lie, and these bounds are given by Feller ([1], inequalities 8.11, 8.12 on p. 303).
Prabhu [8] has obtained the bounds for m = 1. For general m, if we put $E(U_t) =
E(X_t - m) = \rho - m$, where $\rho$ is the mean input, we have

(10) $$\begin{aligned} (Z_0^{k-1} - Z_0^{m+i-1})/(Z_0^{k-1} - 1) &\le V_i \le 1 & (\rho < m), \\ (Z_0^{m+i-1} - Z_0^{k-1})/(1 - Z_0^{k-1}) &\le V_i \le Z_0^{i} & (\rho > m), \\ 1 - (m + i - 1)/(k - 1) &\le V_i \le 1 & (\rho = m), \end{aligned}$$

where $Z_0$ is the unique positive root (other than unity) of the equation
$\sum_j Z^j \Pr(U_t = j) = 1$, i.e., $\sum_j Z^j g_j = Z^m$, and $Z_0 \gtrless 1$ according as $\rho \lessgtr m$.
4. Continuous Input: It would be instructive to study the continuous analogue
of (3). If V(y) is the continuous analogue of $V_i$, the equations (3) become

(11) $$V(y) = \begin{cases} G(m - y) + \int_0^{k-m} V(t)\, dG(t + m - y) & (y \le m), \\ \int_{y-m}^{k-m} V(t)\, dG(t + m - y) & (m \le y < k - m), \end{cases}$$

where $G(x) = \Pr(X_t \le x)$.
4.1. Exponential Input: Consider an exponential input of the type

(12) $dG(x) = \mu e^{-\mu x}\, dx$, $(0 < x < \infty;\ \mu > 0)$.

By applying the transformation $V(t) = 1 - e^{\mu t} \phi(t)$ and substituting (12) in
(11), we get

(13) $$\phi(y) = \begin{cases} \alpha & (y \le m), \\ \alpha - \lambda \int_0^{y-m} \phi(t)\, dt & (y > m), \end{cases}$$

where

$$\lambda = \mu e^{-\mu m},$$

(14) $$\alpha = e^{-\mu k} + \lambda \int_0^{k-m} \phi(t)\, dt .$$

Suppose k = (N + 1)m + U, where $0 \le U < m$. We can solve for $\phi(y)$ successively
for the ranges (m, 2m), (2m, 3m), etc.
For $nm < y \le (n + 1)m$, we have

(15) $$\phi(y) = \alpha \sum_{q=0}^{n} (-\lambda)^q \frac{(y - qm)^q}{q!} \qquad (n = 0, 1, 2, \ldots, N+1).$$

$\alpha$ is determined as follows:

$$\alpha = e^{-\mu k} + \lambda \int_0^{k-m} \phi(t)\, dt = e^{-\mu k} + \lambda \left[ \int_0^{m} \phi(t)\, dt + \cdots \right],$$

so that

(16) $$\alpha = e^{-\mu k} \Big/ \sum_{q=0}^{N+1} (-\lambda)^q \frac{(k - qm)^q}{q!} .$$

Thus, we have

(17) $$V(y) = \begin{cases} 1 - \alpha e^{\mu y} & (y \le m), \\ 1 - \alpha e^{\mu y} \sum_{q=0}^{n} (-\lambda)^q \dfrac{(y - qm)^q}{q!} & (nm < y \le (n+1)m;\ n = 0, 1, \ldots, N). \end{cases}$$

We have the boundary conditions: V(0) = 1, V(k − m + r) = 0 for $r \ge 0$.
From (17) we find V(+0) = 1 − $\alpha$ and V(k − m − 0) > 0, indicating that
there are points of discontinuity at y = 0 and y = k − m.

4.2. Gamma Input: Consider a gamma input

(18) $dG(x) = \bigl(\mu^p/(p - 1)!\bigr)\, e^{-\mu x} x^{p-1}\, dx$ $(0 < x < \infty;\ \mu > 0;\ p = 1, 2, \ldots)$.

Again, applying the transformation $V(t) = 1 - e^{\mu t} \phi(t)$ to (11), we get


a, y 


' 
=O YY! 


(y Sm), 
(19) o(y) = _ 
-—af 40 222 di (y>m), 


2, ¥! 
where 
(20) A= (—1)" "ye, 
‘ ra (ut Bs Hy ttm) 
S pk Y Po * sil 7 
a, =e "(=n)" 2 + wer’ vf WO C= 5th” 
(y = 0,1,--- p- 1). 


(21) 


We get 


ye qm)”** (nm <y S (n+ 1)m; 
(22) $(y) = Ea, 3 (— r) or oie 


where k = (N + 1)m + U, as in Section 4.1.

Prabhu [7] obtained (22) while deriving the distribution of dam storage. The
$\alpha_\nu$'s can be obtained by his method ([7], eqn. (13)).

Finally, we have 


(23) Vy) = 


a e qm)*”** (nm < y 
| oF ad (- ys eal Ss (n+ 1)m,n2 0). 


V(y) has two points of discontinuity at y = 0, y = k − m since the boundary
conditions are V(0) = 1, V(k − m + r) = 0 for $r \ge 0$.


5. Relationship with the Asymptotic Distribution of Dam Content: If H(y)
is the stationary c.d.f. of the dam content $Z_t + X_t$, we get the following integral
equation ([7], eqn. (2)) for continuous input:


— [7° 1 ag(m + y- (y<k—m), 


1-e ov (y Sm), 


yd Y 


(24) H(y) = ‘ 
Gly —k-+m) — [ H(t) dG(m+y—2) (y2k—m). 


By applying the transformation $H(k - y) = 1 - e^{\mu y} \phi(y)$ to the above, for
exponential and gamma inputs, (12) and (18), we obtain the same integral
equations in $\phi(y)$, (13) and (19), as were obtained by applying the transformation
$V(y) = 1 - e^{\mu y} \phi(y)$ to the integral equations for V(y). We, therefore,
obtain

(25) $V(y) = H(k - y)$.

We may verify that (25) holds good for discrete input also.


CORRECTION NOTES
CORRECTION TO 


“THE INDIVIDUAL ERGODIC THEOREM 
OF INFORMATION THEORY” 


By Leo Breiman 
University of California at Los Angeles 


Mr. James Abbott has pointed out that the argument on Page 811 of the above- 
cited work, Ann. Math. Stat., Vol. 28, No. 3 (1957), pp. 809-811, is incorrect. 
The results of the paper are valid, however, and Page 811 may be replaced by 
the following discussion. 

Note that 


n( 2 7 se ’ rt) Ben ***y t+) Ss P(ta41, **° ’ z) 
p(z-x, ata » Zo) P( 2-241, oZ » Xe) 
with probability one. By the concavity of log, it follows that the g, sequence, 


p(z-s, —+28}) 


pP(Z4,***, 2.) 


= — logs ( 


satisfies 
E(ge | 20, °*** , teat) S Gen. 
Since $g_k \ge 0$, and $E g_k < \infty$, the $g_k$ sequence forms a non-negative lower
semi-martingale and hence converges a.s. Actually, the convergence of the $g_k$ sequence
has been previously established by McMillan in [2].
Now consider $P(\sup_{k \le n} g_k > \lambda)$, and define the disjoint sets

$$E_j = \{g_j > \lambda,\ \sup_{k < j} g_k \le \lambda\},$$

whence $P(\sup_{k \le n} g_k > \lambda) = \sum_{j \le n} P(E_j)$. Let $Z_i$ be the cylinder sets $\{x_0 = a_i\}$
and $f_j^{(i)}$ the functions $-\log_2 P(x_0 = a_i \mid x_{-1}, \cdots, x_{-j})$. If $\sum_A f(x_0, x_{-1}, \cdots)$
indicates the sum of $f(x_0, x_{-1}, \cdots)$ over all sequences $(x_0, x_{-1}, \cdots) \in A$, then


P(E) = ples, ---,m) =D Me) ye... 24), 
Ej 7 wifes P(Z-4, *** , Za) 
But on EF; we have the inequality 
p(2z-;, eee, Xo) on a % s > 
p(z;, vos Ra) . 


leading to 


PEs) S 2°) De plas +++ 2a) = 2D PUP > d, sup fi? 2). 


jfiz 
Finally, then,

$$P\Bigl(\sup_{k \le n} g_k > \lambda\Bigr) \le 2^{-\lambda} \sum_i P\Bigl(\sup_{k \le n} f_k^{(i)} > \lambda\Bigr) \le s \cdot 2^{-\lambda},$$

where s is the number of values that the process ranges over. This last inequality
gives $P(\sup_k g_k > \lambda) \le s \cdot 2^{-\lambda}$, which quickly leads to $E(\sup_k g_k) < \infty$.


CORRECTION TO 


“BOUNDS ON NORMAL APPROXIMATIONS TO STUDENT’S AND 
THE CHI-SQUARE DISTRIBUTIONS” 
By David L. Wallace
University of Chicago


The following correction should be made on p. 1127 of the above-titled article 
(Ann. Math. Stat., Vol. 30 (1959), pp. 1121-1130) : In the conclusion of Corollary 
2 to Theorem 4.2, the exponent of n should be —4 and not }4. 







ABSTRACTS OF PAPERS 


(Abstracts of papers presented at the Stanford Annual Meeting of the Institute, 
August 23-26, 1960. Additional abstracts will appear in the December 1960 issue.) 


1. Estimating the Infinitesimal Generator of a Finite State Continuous Time
Markov Process. Arthur Albert.


Let {Z(t), t > 0} be a separable, continuous time Markov process with stationary transition
probabilities $P_{ij}(t)$, i, j = 1, 2, ..., M. Under suitable regularity conditions, the matrix
of transition probabilities, P(t), can be expressed in the form P(t) = exp (tQ), where Q
is an M × M matrix and is called the "infinitesimal generator" for the process.

In this paper, a density on the space of sample functions over [0, t) is constructed. This
density depends upon Q. If Q is unknown, the maximum likelihood estimate

$$\hat{Q}(k, t) = \|\hat{q}_{ij}(k, t)\|,$$

based upon k independent realizations of the process over [0, t) can be derived. If each
state has positive probability of being occupied during [0, t) and if the number of independent
observations, k, grows large (t held fixed), then $\hat{q}_{ij}$ is strongly consistent and the
joint distribution of the set $\{k^{1/2}(\hat{q}_{ij} - q_{ij})\}_{i \ne j}$ (suitably normalized), is asymptotically
normal with zero mean and covariance equal to the identity matrix. If k is held fixed (at
one, say) and if t grows large, then $\hat{q}_{ij}$ is again strongly consistent and the joint distribution
of the set $\{t^{1/2}(\hat{q}_{ij} - q_{ij})\}_{i \ne j}$ (suitably normalized), is asymptotically normal with
zero mean and covariance equal to the identity matrix, provided that the process
{Z(t), t > 0} is metrically transitive (but not necessarily stationary) and has no transient
states.
The asymptotic variances of the $\hat{q}_{ij}$ are computed in both cases.


2. The Sequential Design of Experiments for Infinitely Many States of Nature.
Arthur Albert. (By title)


In a recent paper (Ann. Math. Stat. Vol. 30 (1959), pp. 755-770) Chernoff discussed a
problem which he called "The Sequential Design of Experiments" as it applied to the two
action (hypothesis testing) case. In that paper, a procedure was exhibited for which the
risk was approximately $-c \log c / I(\theta)$, when $\theta$ is the true state of nature, $I(\theta)$ is an appropriately
defined information number and c, the cost per experimental trial, is small. It was
also shown that in order for some other procedure to do significantly better for some value 
of the parameter, it must do worse by an order of magnitude (as c — 0) at some other value 
of the parameter. These results were obtained under the assumption that the parameter 
space is finite. In the present paper, the assumption of finiteness is dispensed with. The 
procedures proposed here are closely akin to Chernoff’s procedure, and analogous (though 
slightly weaker) optimality properties are derived. 


3. Maximal Independent Stochastic Processes. C. B. Bell, University of California,
Berkeley. (By title)


R. Pyke (1958) asked: What is the maximum cardinality, $M_a$, of a family of independent
random variables defined on an abstract space $\Omega$ of cardinality a? (1) For $a < \aleph_0$, an
elementary counting process yields $M_a = [\log_2 a]$. (2) For $a = \aleph_0$, a construction and a result
of E. Marczewski (Colloq. Math., 1955) yield $M_{\aleph_0} = \aleph_0$. (3) $M_c = 2^c$ follows from a result
of Kakutani and Oxtoby (Ann. Math., 1950) for the real line. (4) For a 2 C*, one notes
that a subset of $\Omega$ has the cardinality of a cartesian product of n real lines. Consequently,
an elementary construction provides $M_a \ge n \cdot 2^c$. (5) Following the method mentioned in
(4) above and using the Generalized Continuum Hypothesis it is established that My, 2
max [M. , &--:] for ordinals r. Open problem: Can the Kakutani-Oxtoby construction be
generalized to yield $M_a = 2^a$ for all a > c?


4. The Covariance Function of a Simple Trunk Group, with Applications to 
Traffic Measurement. V. E. Benes, Bell Telephone Laboratories and 
Dartmouth College. (By title) 


Erlang’s classical model for telephone traffic is considered: N trunks, calls arriving in a 
Poisson process, and negative exponential holding-times. Let N(t) be the number of trunks
in use at t. An explicit formula for the covariance R(·) of N(·) in terms of the characteristic
values of the transition matrix of the Markov process N(·) is obtained. Also, R(·)
is expressed purely in terms of constants and the "recovery function," i.e., the transition
probability Pr{N(t) = N | N(0) = N}. R(·) is accurately approximated by $R(0) \exp\{r_1 t\}$,
where $r_1$ is the largest negative characteristic value, itself well approximated (underestimated)
by −E{N(·)}/R(0). Exact and approximate formulas for sampling error in traffic
measurement are deduced from these results. 


5. Limiting Distribution of the Maximum in an Infinite Sequence of Exchangeable
Random Variables. Simeon M. Berman, Columbia University. (By
title)


Let $\{X_n : n = 1, 2, \ldots\}$ be an infinite sequence of exchangeable random variables (r.v.'s),
i.e., the joint distribution function (d.f.) of any m of these r.v.'s does not depend on their
subscripts but only on their number m. The limiting d.f. of $Z_n = \max (X_1, X_2, \ldots, X_n)$
is characterized; a necessary and sufficient condition for the convergence of the d.f. of $Z_n$
is given under an assumption on the d.f. of $X_1$. Let $\Phi(x)$ be one of the three limiting d.f.'s
of maxima of independent r.v.'s with a common d.f.; let A(y) be any d.f. such
that $\lim_{y \to 0} A(y) = 0$ and $\lim_{y \to \infty} A(y) = 1$. Then a d.f. is a limiting d.f. of $Z_n$ if and only if
it is of the form $\int_0^\infty [\Phi(x)]^y \, dA(y)$. Let $\{Y_n : n = 1, 2, \ldots\}$ be independent r.v.'s with the
same marginal d.f.'s as $X_n$; suppose that $W_n = \max (Y_1, Y_2, \ldots, Y_n)$ has the limiting
d.f. $\Phi(x)$, that is, there exist sequences $\{a_n\}$ and $\{b_n\}$ such that for all t,

$$\lim_{n \to \infty} P\{a_n^{-1}(W_n - b_n) \le t\} = \Phi(t).$$

Under this assumption a necessary and sufficient condition is given for

$$\lim_{n \to \infty} P\{a_n^{-1}(Z_n - b_n) \le t\} = \int_0^\infty [\Phi(t)]^y \, dA(y).$$

For each integer k and real number u, let

$$\mu_k(u) = P\{X_1 > u, X_2 > u, \ldots, X_k > u\} / [P\{X_1 > u\}];$$

then $\{\mu_k(u) : k = 1, 2, \ldots\}$ is a moment sequence which uniquely determines a d.f. $A_u(y)$.
The condition is that the d.f.'s $A_u(y)$ converge completely to A(y) as $u \to \infty$.


6. Elements of the Sequential Design of Experiments. Stuart A. Bessler,
Sylvania Electronic Defense Laboratories.


An experimenter observes a physical phenomenon, the outcome of which depends upon
some unknown parameter $\theta$ belonging to a finite parameter space $\Theta$. The experimenter
wishes to choose which of k alternative hypotheses best describes the parameter $\theta$. To aid
him in his decision he may perform experiments, e, selected from an infinite experiment
space $\mathcal{E}$. At each stage of the experimental process the experimenter must either stop experimenting
and choose a terminal action or continue experimenting, in which case he must
choose the next experiment. The "rule" which the experimenter uses in making these decisions
will be called a sequential decision procedure. A sequential decision procedure is
proposed and its optimal character is described. The procedure is demonstrated by applying
it to the problem of choosing which of three normal populations with common variances
has the largest mean. Several other examples are discussed. A measure of efficiency is defined,
and for each example the efficiency of a common alternative decision procedure is
computed.


7. Alias Sets of Error Vectors in the Theory of Error Correcting Group Codes. 
R. C. Bose, University of North Carolina and Case Institute of Tech- 
nology. 


Consider an $n \times r$ parity check matrix $A$ of rank $r$ whose elements belong to the Galois
field GF($s$), $s = p^m$. The letters of the code consist of all $n$-place row vectors $y$ for which
$yA = 0$. Suppose $y$ is transmitted over an $s$-ary channel, and the output is $y + \varepsilon$,
$\varepsilon = (e_1, e_2, \ldots, e_n)$. Then $\varepsilon$ is the error vector. Let $a_1, a_2, \ldots, a_n$ be the row vectors of $A$.
The vectors $\varepsilon$ for which $e_1 a_1 + e_2 a_2 + \cdots + e_n a_n$ has a constant value may be said to form
an alias set. Let $\Omega$ be the set of error vectors which we wish to correct with certainty. Then
no alias set should contain more than one member from $\Omega$. Subject to this condition one
would like to maximize $n$ for a given $r$. This principle is of very wide application. For ex-
ample, let $s = 2$, and let $\Omega$ consist of all vectors with one non-zero or two adjacent non-zero
coordinates. Then we get Abramson's (IRE Trans. Vol. IT-5, 1959, pp. 150-157) single error
and double adjacent error correcting (SEC-DAEC) code by choosing $A$ such that the
vectors $a_1, a_2, \ldots, a_n, a_1 + a_2, \ldots, a_{n-1} + a_n$ constituting the set $\Omega^*$ are all distinct.
For decoding we calculate $(y + \varepsilon)A = \varepsilon A$. If $\varepsilon$ belongs to $\Omega$ then $\varepsilon A$ belongs to $\Omega^*$, and
uniquely determines $\varepsilon$. The required condition is satisfied if $a_i = (\beta_i, 1)$,
$i = 1, 2, \ldots, 2^{r-1} - 1$, where $\beta_i$ is the coefficient vector of the $(r - 2)$th degree polynomial
which represents the element $x^i$ of GF($2^{r-1}$), $x$ being a primitive element.
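
The construction in the last sentence can be tried out on a small case. The following is only an illustrative sketch, not the author's program: it assumes r = 4 check digits and the primitive polynomial x^3 + x + 1 for GF(2^3) (both choices are assumptions), packs each row a_i = (beta_i, 1) into an integer, and builds the table from syndromes to correctable error patterns. The function names are hypothetical.

```python
def gf2m_powers(m, prim_poly):
    """Return [x^1, ..., x^(2^m - 1)] in GF(2^m); each element is a bit mask of
    polynomial coefficients (bit j is the coefficient of x^j)."""
    powers, value = [], 1
    for _ in range(2 ** m - 1):
        value <<= 1                      # multiply by x
        if value & (1 << m):             # reduce modulo the primitive polynomial
            value ^= prim_poly
        powers.append(value)
    return powers


def build_rows(r, prim_poly):
    """Rows a_i = (beta_i, 1) of the parity check matrix, packed as r-bit integers."""
    return [(beta << 1) | 1 for beta in gf2m_powers(r - 1, prim_poly)]


def syndrome_table(rows):
    """Map each syndrome to the single or double-adjacent error pattern producing it.

    Raises ValueError if two correctable patterns fall in the same alias set."""
    n = len(rows)
    patterns = [(i,) for i in range(n)] + [(i, i + 1) for i in range(n - 1)]
    table = {}
    for pattern in patterns:
        s = 0
        for pos in pattern:
            s ^= rows[pos]               # e_1 a_1 + ... + e_n a_n over GF(2)
        if s in table:
            raise ValueError("two correctable errors share an alias set")
        table[s] = pattern
    return table


def correct(received, rows, table):
    """Correct a received 0/1 word carrying at most one error pattern from the table."""
    s = 0
    for bit, row in zip(received, rows):
        if bit:
            s ^= row
    fixed = list(received)
    for pos in table.get(s, ()):         # unknown or zero syndrome: leave word unchanged
        fixed[pos] ^= 1
    return fixed


# Assumed example: r = 4, GF(2^3) generated by x^3 + x + 1, giving n = 7.
ROWS = build_rows(4, 0b1011)
TABLE = syndrome_table(ROWS)
```

With these assumed parameters the 7 single errors and 6 adjacent double errors occupy 13 distinct non-zero syndromes out of the 15 available, so each lies in its own alias set.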


8. On Methods of Constructing Sets of Mutually Orthogonal Latin Squares 
Using a Computer. R. C. Bose, I. M. Chakravarti, D. E. Knuth, Case
Institute of Technology and University of North Carolina. (Invited Paper)


This is in continuation of the work presented under the same title at the Midwestern Re-
gional Meeting of IMS this year. The method is to start with module G(2, 2t) whose ele-
ments are vectors $x = (a, b)$ where a is a residue class (mod 2) and b is a residue class
(mod 2t), the addition being defined by $(a_1, b_1) + (a_2, b_2) = (c, d)$ where $a_1 + a_2 = c$
(mod 2), $b_1 + b_2 = d$ (mod 2t), and where $P_1[x_i] = a_i$ and $P_2[x_i] = b_i$, and
(0, 0), (0, 1), ..., (0, 2t − 1), (1, 0), ..., (1, 2t − 1) is the standard order. The existence of
a set of m mutually orthogonal Latin squares based on a module G is known (Mann 1942)
to be equivalent to the existence of a matrix $X_{m \times 4t} = ((x_{ij}))$ whose rows are elements of
G and amongst the 4t differences of any two rows every element of G occurs once. The
existence of $X_{m \times 4t}$ implies the existence of $A_{m \times 4t} = ((a_{ij})) = ((P_1[x_{ij}]))$ where $a_{ij} = 0$ or 1
and in every two-rowed submatrix of A the four possible pairs $(0,0)^T$, $(0,1)^T$, $(1,0)^T$ and $(1,1)^T$
occur as columns with equal frequency t. Starting with such a matrix A, which exists
whenever a Hadamard matrix of order 4t exists, a programme was written for adjoining a
second coordinate $b_{ij}$ to every $a_{ij}$, where $b_{ij}$ belongs to the ring of residue classes (mod 2t),
so that a matrix $X_{m \times 4t} = ((a_{ij}, b_{ij}))$ could be obtained. For t = 3, this method yielded m = 5
mutually orthogonal Latin squares of order 12. These results have also been generalized
in other directions for different orders.
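
The two conditions used above (every element of G occurring once among the differences of any two rows, and the balance of the first coordinates) are easy to check mechanically. The sketch below is not the authors' programme; it only verifies the conditions, reading the entries of X as pairs (a, b) in G(2, 2t). The toy matrix for t = 1 is an assumption made for illustration: it uses the three non-zero rows of the GF(4) multiplication table, whose additive group is G(2, 2).

```python
from collections import Counter
from itertools import combinations


def is_difference_matrix(X, t):
    """True if, for every two rows, the 4t entrywise differences cover G(2, 2t) once each."""
    G = [(a, b) for a in range(2) for b in range(2 * t)]
    for row1, row2 in combinations(X, 2):
        if len(row1) != 4 * t or len(row2) != 4 * t:
            return False
        diffs = Counter(((a1 - a2) % 2, (b1 - b2) % (2 * t))
                        for (a1, b1), (a2, b2) in zip(row1, row2))
        if any(diffs[g] != 1 for g in G):
            return False
    return True


def first_coordinates_balanced(X, t):
    """True if, in every two rows of A = ((a_ij)), each pair 00, 01, 10, 11 occurs t times."""
    for row1, row2 in combinations(X, 2):
        pairs = Counter((x1[0], x2[0]) for x1, x2 in zip(row1, row2))
        if any(pairs[p] != t for p in [(0, 0), (0, 1), (1, 0), (1, 1)]):
            return False
    return True


# Assumed toy case t = 1: elements 0, 1, w, w^2 of GF(4) coded as (0,0), (0,1), (1,0), (1,1).
O, I, W, W2 = (0, 0), (0, 1), (1, 0), (1, 1)
X_TOY = [
    [O, I, W, W2],
    [O, W, W2, I],
    [O, W2, I, W],
]
```

Adjoining an all-zero row to this toy matrix keeps the difference property but breaks the balance of the first coordinates, which shows that the two checks are not redundant.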


9. Best Fit to a Random Variable by a Random Variable Measurable with Re-
spect to a $\sigma$-Lattice. H. D. Brunk, University of Missouri.


Let $(\Omega, \mathcal{S}, \mu)$ be a probability space and f a random variable. Let $\mathcal{L}$ be a sub-$\sigma$-lattice of
$\mathcal{S}$. Then there is an $\mathcal{L}$-measurable random variable g (for real t, $\{\omega \in \Omega : g(\omega) < t\} \in \mathcal{L}$)
minimizing $\int (f - g)^2 \, d\mu$ (if appropriate integrals exist) in the class of $\mathcal{L}$-measurable
random variables. More generally, the squared difference may be replaced by the W. H.
Young form $\Delta_\Phi(\cdot, \cdot)$ determined by an arbitrary convex function $\Phi$: the $\mathcal{L}$-measurable
random variable g minimizing $\int \Delta_\Phi(f, g) \, d\mu$ in the class of $\mathcal{L}$-measurable random variables
is independent of $\Phi$ (assuming appropriate integrals exist). When $\mathcal{L}$ is a sub-$\sigma$-field of $\mathcal{S}$,
then g is the conditional expectation $E(f \mid \mathcal{L})$. In special cases treated by van Eeden (Indag.
Math., Vol. 19(1957), pp. 128-136, 201-211) and by Ayer, Brunk, Ewing, Reid, Silverman,
and Utz (Ann. Math. Stat., Vol. 26(1955), pp. 641-647, 607-616, Pac. J. Math., Vol. 7(1957),
pp. 833-846) g is the solution of a problem in maximum likelihood estimation of ordered
parameters; in these cases the $\sigma$-lattice $\mathcal{L}$ is not a $\sigma$-field.
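
For the simplest special case, a finite totally ordered sample space with g required to be non-decreasing, the least-squares fit can be computed by the familiar pool-adjacent-violators scheme. The sketch below assumes that setting; it is offered as an illustration of the ordered-parameter problem and is not taken from the paper. The function name and the optional weights are hypothetical.

```python
def pool_adjacent_violators(f, weights=None):
    """Non-decreasing g minimizing sum_i w_i * (f_i - g_i)**2 over a finite ordered set."""
    w = [1.0] * len(f) if weights is None else list(weights)
    blocks = []                                   # each block: [mean, total weight, length]
    for value, weight in zip(f, w):
        blocks.append([float(value), float(weight), 1])
        # Merge backwards while the ordering constraint is violated.
        while len(blocks) > 1 and blocks[-2][0] > blocks[-1][0]:
            mean2, w2, n2 = blocks.pop()
            mean1, w1, n1 = blocks.pop()
            total = w1 + w2
            blocks.append([(mean1 * w1 + mean2 * w2) / total, total, n1 + n2])
    g = []
    for mean, _, length in blocks:
        g.extend([mean] * length)                 # g is constant on each pooled block
    return g
```

On each pooled block the fitted value is the weighted average of f, which mirrors the statement that g reduces to a conditional expectation when the lattice is a field.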


10. On the Non-null Distribution of the Studentized Difference between the 
Two Largest Sample Values (Preliminary Report). André Croteau and
Jacques St-Pierre, University of Montreal.


The non-null distribution of the difference between the two largest sample values has 
already been obtained by A. Zinger and J. St-Pierre (Biometrika, Vol. 45, Parts 3 and 4, 
December 1958, pp. 436-447) in the case of normal populations with known variances. In 
the case of unknown variances, the distribution of the studentized difference between the 
two largest sample values is obtained for three normal populations. The distribution takes 
the form of an iterated integral involving recurrence relations leading rather easily to 
numerical evaluations. A generalisation to the case of n populations is presently being studied
by the authors.


11. Random Noise in Relay Control Systems. R. C. Davis, Convair Division of 
General Dynamics Corporation. 


A general method is developed to obtain the probability distribution of the error in a 
single closed-loop relay control system in which one controls a linear time-invariant dy- 
namic element in the presence of a time-varying signal perturbed additively by Gaussian 
noise. The noise is allowed to be of a particular nonstationary type and specifically is the 
output of a perfect amplifier with time variable gain in cascade with a linear time-invariant 
filter with a rational amplitude versus frequency response—the input being the derivative 
of a Wiener process. The method used is the development of the theory of a particular type 
of discontinuous Markoff process for which the corresponding analogy in heat conduction 
is the conduction of heat in a moving medium in which there is a surface of discontinuity in 
medium velocity. In this way both the transient and steady state probability distributions 
of error are obtained. The probability distribution of error is obtained explicitly and in- 
volves line integrals of the Gaussian probability density function in the phase space of the 
error and certain of its time derivatives. 


12. Sample Size for a Specified Width Confidence Interval on the Variance of a 
Normal Distribution. Franklin A. Graybill and Robert D. Morrison,
Oklahoma State University. (By title)


If an experimenter decides to use a confidence interval to locate a parameter, he is con-
cerned with at least two things: (1) Does the interval contain the parameter? (2) How wide
is the interval? In general, the answer to these questions cannot be given with absolute
certainty, but must be given with a probability statement. The problem the experimenter
then faces is the determination of the sample size n such that (A) the probability will be
equal to $1 - \alpha$ that the confidence interval contains the parameter, and (B) the probability
will be equal to $\beta^2$ that the width of the confidence interval will be less than d units (where
$\alpha$, $\beta$, and d are specified). $1 - \alpha$ will be called the confidence coefficient, and $\beta^2$ will be
called the width coefficient. To solve this problem will generally require two things: (1)
The form of the frequency function; (2) Some previous information on the unknown param-
eters. This suggests that the sample be taken in two steps; the first sample will be used to
determine the number of observations to be taken in the second sample so that (A) and
(B) will be satisfied. For a confidence interval on the mean of a normal population with
unknown variance this problem has been solved by Stein for $\beta^2 = 1$. The purpose of this
paper is to illustrate a method for determining n to satisfy (A) and (B) for the variance of
a normal distribution. A set of tables is presented which will be needed for the solution
of this problem.


13. On the Unbiasedness of Yates’ Method of Estimation Using Interblock 
Information. Franklin A. Graybill and V. Seshadri, Oklahoma State
University. (By title)


In a balanced incomplete block model with blocks and errors random normal variables, 
Yates has shown that there are two independent unbiased estimates for any treatment 
contrast. These are referred to as intrablock and interblock estimators. Yates has also 
given a method for combining these two estimators which depends on the (unknown)
variances and has shown how to estimate the variances from an analysis of variance. Since
this combined estimator is used quite extensively, it seems desirable to study its properties. 
Graybill and Weeks have shown that Yates’ combined estimator is based on a set of minimal 
sufficient statistics and have presented an estimator which is unbiased. The purpose of this 
note is to show that Yates’ estimator, which is based on intrablock and interblock information, 
is unbiased. 


14. On the Distribution of the Ratio of the Largest of Several Chi-Squares to an 
Independent Chi-Square with Application to Ranking Problems. S. S.
Gupta and M. Sobel, Bell Telephone Laboratories.


The distribution of $\chi^2_{\max}/\chi^2_0$ and its upper percentage points are considered, where $\chi^2_{\max}$
is the maximum of p independent chi-squares and $\chi^2_0$ is a chi-square independent of the p
others. A common number $\nu$ of degrees of freedom is the principal case considered and
tables of percentage points (25%, 10%, 5%, 1%) are given for $\nu = 2(2)50$ and $p = 1(1)10$;
the case p = 1 which reduces to an F-distribution being used as a check. The computed 
tables have an application in the selection of a subset containing the “best” of several 
Gamma or Type III populations, i.e., the one with the largest scale parameter. In particu- 
lar, if several exponential populations are individually observed until exactly r failures 
are obtained from each then the above tables can be used for selecting a subset containing 
the one with the largest mean life. 


15. Expected Values of Normal Order Statistics. H. Leon Harter, Wright-
Patterson Air Force Base.


A brief history is given of the development of the theory of order statistics and of past 
efforts to tabulate their expected values for samples from a normal population. A fuller 
account is given of the method of computation of a five-decimal-place table of the expected 
values of all order statistics for samples of size n from a normal population. Included is
such a table for n = 2(1)100 and for values of n, none of whose prime factors exceeds seven,
up through n = 400. Also included is a discussion of an approximation proposed by Blom, 
and a table of values of the constant a required for this approximation for selected values 
of n, together with interpolation formulas for estimating a for other values of n. A discus- 
sion is given of actual and potential uses of the tables. 
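
Blom's approximation takes the expected value of the ith order statistic in a normal sample of size n to be roughly $\Phi^{-1}((i - \alpha)/(n - 2\alpha + 1))$. The sketch below uses the commonly quoted value $\alpha = 3/8$ as a default, which is an assumption here; the tabulated values of $\alpha$ referred to in the abstract are not reproduced. The function name is hypothetical.

```python
from statistics import NormalDist


def blom_expected_order_statistics(n, alpha=0.375):
    """Approximate E[X_(i)], i = 1..n, for a standard normal sample of size n.

    alpha = 3/8 is the usual rule-of-thumb constant, assumed here for illustration."""
    inv_cdf = NormalDist().inv_cdf
    return [inv_cdf((i - alpha) / (n - 2 * alpha + 1)) for i in range(1, n + 1)]


# Example: n = 2, where the exact expected value of the larger observation is 1/sqrt(pi).
approx_n2 = blom_expected_order_statistics(2)
```

For n = 2 the approximation with this default differs from the exact value $1/\sqrt{\pi} \approx 0.564$ by a few hundredths, which illustrates why a table of better-adapted constants is useful.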


16. Circular Error Probabilities. H. Leon Harter, Wright-Patterson Air Force 
Base. (By title) 


A problem which often arises in connection with the determination of probabilities of 
various miss distances of bombs and missiles is the following: Let x and y be two normally
and independently distributed orthogonal components of the miss distance, each with
mean zero and with standard deviations $\sigma_x \ge \sigma_y$. Now for various values of $c = \sigma_y/\sigma_x$, it
is required to determine (1) the probability P that the point of impact lies inside a circle
with center at the target and radius $K\sigma_x$, and (2) the value of K such that the probability
is P that the point of impact lies inside such a circle. Solutions of (1), for c = 0.0(0.1)1.0 and 
K = 0.1(0.1)5.8, and (2), for the same values of c and P = 0.5, 0.75, 0.9, 0.95, 0.975, 0.99, 
0.995, 0.9975, and 0.999, are given, along with some hypothetical examples of the applica- 
tion of the tables. 
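
Problem (1) can be evaluated numerically. Writing $u = x/\sigma_x$, the probability is $\int_{-K}^{K} \varphi(u)\,\mathrm{erf}\big(\sqrt{K^2 - u^2}/(c\sqrt{2})\big)\,du$, with $\varphi$ the standard normal density. The sketch below is one possible way to compute it, not the method used for the tables: it substitutes $u = K \sin\theta$ and applies Simpson's rule. The function name and the step count are assumptions.

```python
import math


def circular_probability(K, c, steps=400):
    """P(x^2 + y^2 <= (K*sigma_x)^2) for independent zero-mean normals, c = sigma_y/sigma_x."""
    if not 0.0 <= c <= 1.0:
        raise ValueError("c = sigma_y / sigma_x must lie in [0, 1]")
    if c == 0.0:                                  # degenerate case: y is identically zero
        return math.erf(K / math.sqrt(2.0))
    if steps % 2:
        steps += 1                                # Simpson's rule needs an even step count

    def integrand(theta):
        u = K * math.sin(theta)
        density = math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)
        inner = math.erf(K * math.cos(theta) / (c * math.sqrt(2.0)))
        return density * inner * K * math.cos(theta)

    a, b = -math.pi / 2.0, math.pi / 2.0
    h = (b - a) / steps
    total = integrand(a) + integrand(b)
    for j in range(1, steps):
        total += (4 if j % 2 else 2) * integrand(a + j * h)
    return total * h / 3.0
```

For c = 1 the result can be compared with the closed form $1 - e^{-K^2/2}$, and for c = 0 it reduces to $2\Phi(K) - 1$.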


17. Comparison of Normal Scores and Wilcoxon Tests. J. L. Hodges, Jr. and
E. L. Lehmann, University of California, Berkeley. (By title)


The normal scores test (i.e. the Fisher-Yates $c_1$-test or the van der Waerden X-test)
and the Wilcoxon test have been proposed for testing the equality of two distributions
against the "shift" alternative that the populations have distributions $F(x)$ and $F(x - \theta)$.
From the known limiting behavior of the test statistics one obtains an expression for the
asymptotic relative efficiency e(F) of Wilcoxon to normal scores. It is shown that
$0 \le e(F) \le 6/\pi$ for all F, and that all values including the endpoints may be attained.


18. Minimal Sufficient Statistics for the Two-Way Classification Mixed Model 
Design. Robert A. Hultquist and Franklin A. Graybill, Oklahoma
State University. (By title)


A theorem proved by Rao and Blackwell reveals the importance of minimal sufficient
statistics in point estimation problems. This theorem states: If Y is a vector of observa-
tions, S is a minimal sufficient statistic for a vector of parameters $\theta$ and f(Y) is an unbiased
estimate of $g(\theta)$, then $h(S) = E[f(Y) \mid S]$ is also an unbiased estimate of $g(\theta)$ based on S
and such that variance f(Y) $\ge$ variance h(S). We thus see that if we are interested in de-
termining minimum variance unbiased estimators of variance components these estimators
must be based on a minimal sufficient statistic. The objective of this paper is to exhibit
minimal sufficient statistics for the two-way classification mixed model design with unequal
numbers in the subcells.


19. Three-Quarter Replicates of $2^3$ and $2^4$ Designs. Peter W. M. John, Cali-
fornia Research Corporation.


Half replicates of $2^3$ and $2^4$ designs do not enable all the main effects and two-factor inter-
actions to be estimated clear of two-factor interactions. Three-quarter replicates are ob-
tained which give all main effects and two-factor interactions clear for the $2^4$ design; for
the $2^3$ design main effects are clear and, if any one of the two-factor interactions is negligible,
the other two are clear. In each case, the effects are estimated by extracting half replicates
from the design.


20. On the Generalization of Sverdrup's Lemma and Its Applications to Multi-
variate Distribution Theory. D. G. Kabe, Karnatak University. (By title)
(Introduced by B. D. Tikkiwal) 


Tikkiwal and Kabe (Karnatak Univ. J., 1958) have given an analytic-cum-geometric proof
of Sverdrup's lemma (Skand. Aktuar. 1947). This lemma is now generalized for a p-vari-
ate population. Let the vectors $X_i = (x_{i1}, x_{i2}, \ldots, x_{in})$ for $i = 1, 2, \ldots, p$ have the den-
sity $f(X_1X_1', X_1X_2', \ldots, X_pX_p', BX_1', BX_2', \ldots, BX_p')$; then the density of $X_iX_j' = b_{ij}$,
$BX_i' = v_i$ $(i, j = 1, 2, \ldots, p)$, B being a $q \times n$ matrix of rank q, is given by

$$2^{-p} \prod_{i=1}^{p} C(n - q - p + i)\, |BB'|^{-p/2}\, f(b_{11}, b_{12}, \ldots, b_{pp}, v_1, \ldots, v_p)\, |b_{ij} - v_i'(BB')^{-1} v_j|^{(n-q-p-1)/2},$$


C(n) being the surface area of a unit n-dimensional sphere. Almost all the distributions in 
multivariate theory have been derived by the help of this lemma. 


21. Approximations to Neyman Type A and Negative Binomial Distributions 
in Practical Problems (Preliminary Report). S. K. Katti, Florida State
University.


The Neyman Type A and the Negative Binomial distributions have been used for fit- 
ting data arising from biological phenomena with varying degrees of success, e.g. G. Beall, 
"The Fit and Significance of Contagious Distributions when Applied to Observations on
Larval Insects", Ecology, Vol. 21 (1940), pp. 460-474 and C. I. Bliss and R. A. Fisher,
"Fitting of Negative Binomial Distribution to Biological Data", Biometrics, Vol. 9, pp.
176-200. In the present work, it is shown that these distributions approximate to elementary 
distributions such as Poisson, Poisson with zeros added and Logarithmic in various regions 
of the parameter space. Preliminary fitting indicates that the elementary—and hence simple 
—distributions can be used with advantage as alternatives to these relatively complex 
distributions in many a practical situation. It is found that a reasonable judgment about 
the elementary distribution to be used can be made on the basis of the mean and the first 
frequency. 


22. Two Sample Nonparametric Tests for Scale Parameter (Preliminary Re-
port). Jerome Klotz, University of California, Berkeley.


Let $X_1, \ldots, X_m$ and $Y_1, \ldots, Y_n$ be samples from populations with continuous distri-
butions F and G. We are interested in tests of the hypothesis F = G that will be powerful
against differences in scale when the populations are equivalent in location. Siegel and
Tukey have recently devised a way to use the Wilcoxon statistic for this problem. The
Pitman efficiency of their test for the normal case relative to the F-test is $6/\pi^2$. Pitman
efficiency one relative to the F can be obtained for normality with the use of the following
rank order statistic S. Assign weight $[\Phi^{-1}(i/(N + 1))]^2$ to the ith smallest observation in the
pooled sample where N = m + n, and let S be the sum of these weights over the observa-
tions from F. (Our weights are the squares of those used in Van der Waerden's X-test for
the corresponding location problem.) As a rank order statistic S has an exact null distribu-
tion; we give a small table for the null distribution and approximations for m, n large. The
Pitman efficiency of the test of Siegel and Tukey relative to the S-test can take on any
value between 0 and $\infty$ for different F.
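
The statistic S described above is straightforward to compute. The following is a minimal sketch, assuming no ties in the pooled sample (as holds with probability one for continuous distributions); the function name is hypothetical, and the null distribution table and large-sample approximations mentioned in the abstract are not reproduced.

```python
from statistics import NormalDist


def klotz_statistic(x, y):
    """Sum over the x-sample of [Phi^{-1}(rank / (N + 1))]^2, ranks taken in the pooled sample."""
    N = len(x) + len(y)
    pooled = sorted([(value, 0) for value in x] + [(value, 1) for value in y])
    inv_cdf = NormalDist().inv_cdf
    return sum(inv_cdf(rank / (N + 1)) ** 2
               for rank, (_, label) in enumerate(pooled, start=1)
               if label == 0)


# Example: the x-sample is more spread out, so it collects the extreme (large) weights.
s_wide = klotz_statistic([-10.0, 10.0], [-1.0, 1.0])
s_narrow = klotz_statistic([-1.0, 1.0], [-10.0, 10.0])
```

Large values of S point to the first sample having the larger scale, since the weights grow towards both ends of the pooled ordering.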


23. Zero Correlation and Independence. H. O. Lancaster, University of Syd-
ney. (By title)


Let $\{x_j\}$ be a set of n random variables. Let orthonormal functions be defined on each
marginal distribution such that $x_j^{(0)} = 1$ for $j = 1, 2, \ldots, n$ and so that $\{x_j^{(i_j)}\}$ forms a
basis if $x_j$ has only a finite number ($n_j$) of points of increase, $i_j = 1, 2, \ldots, n_j$, and $\{x_j^{(i_j)}\}$
is a complete orthonormal set if $x_j$ has a general type of distribution. Let generalised co-
efficients of correlation be defined, $\rho^{(i_1 i_2 \cdots i_n)} = E\{\prod_j x_j^{(i_j)}\}$. These coefficients are ordinary
coefficients of correlation if precisely two of the $i_j$ are non-zero. If more than two of the
$i_j$ are non-zero, the generalized coefficients may not be less than unity in absolute value.
Theorem: A necessary and sufficient condition for independence of the set $\{x_j\}$ is that the
generalized coefficients of correlation should all vanish. The bivariate case has been treated 
by Sarmanov, O. V., Doklady Akad. Nauk SSSR, Vol. 121 (1958), pp. 52-55 and Lancaster, 
H. O., Aust. J. Stat., Vol. 1 (1959), pp. 53-56 and J. Aust. Math. Soc. (in the press), in which 
the multivariate case of the theorem also is proved without restriction. 


24. Sequential Model Building for Prediction in Regression Analysis, I (Prelim-
inary Report). Harold J. Larson and T. A. Bancroft, Iowa State Uni-
versity.


Two different sequential procedures for deciding on the "length" of the linear regres-
sion model to use for predictions are evaluated, both assuming the population variance
$\sigma^2$ to be known. In the first procedure the experimenter fits all the independent variables
available, sequentially tests the coefficients of the "doubtful" ones to be zero, and deletes
from the model the terms whose coefficients do not differ significantly from zero. In the
second procedure the experimenter fits a set of "basic" variables he knows to be necessary,
sequentially tests the coefficients of the "nonbasic" variables to be zero and adds to his
model those nonbasic variables whose coefficients differ significantly from zero. The ex- 
pected value and the variance of the estimator are discussed for each procedure and limited 
tables for certain specific values of the parameters are presented to allow explicit evaluation 
of the bias of the estimators. 


25. The Use of Sample Quasi-Ranges in Setting Confidence Intervals for the 
Population Standard Deviation. F. C. Leone, Y. H. Rutenberg, and
C. W. Topp,* Case Institute of Technology and *Fenn College.


The problem is the choice of an optimal selection method of quasi-ranges for setting one-sided
confidence bounds and confidence intervals for the standard deviation from a given
distribution. The proposed methods of optimal selection are applied to random ordered 
samples from the normal, exponential and rectangular distributions. Tables of confidence 
bounds for the standard deviation of these distributions are given for confidence levels 
commonly used in statistical work. These are compared with the results of standard pro- 
cedures. 


26. On a Property of a Test for the Equality of Two Normal Dispersion Matrices 
Against One-sided Alternatives. Wadie F. Mikhail, University of North
Carolina.


The monotonic character, with respect to the variation of each noncentrality parameter, 
of the power function of the largest root test of normal multivariate analysis of variance or 
of independence between two sets of variates was proved in an earlier paper by S. N. Roy
and the author. This paper obtains, using the same technique used before, similar results
for four tests derived by S. N. Roy and R. Gnanadesikan for the equality of two dispersion
matrices, in the normal multivariate set-up, against one-sided alternatives. 


27. A Note on Simple Sampling Plans (Preliminary Report). T. V. Narayana
and S. G. Mohanty, Queen's University.


From previous work done by one of the authors, it is known that a simple sampling plan 
of size n can be characterised by a unique vector of n non-negative integers satisfying cer- 
tain conditions. A simple symmetric sampling plan of size n is defined as one in which the 
boundary points are symmetric about the line y = x. The following theorem is proved:
The number of simple symmetric sampling plans of size n is (jm oaMt (n + 1)/2] 
where [x] is the largest integer contained in x. This theorem follows from known results on
the number of compositions of an integer dominated by a given composition of this integer 
(Canad. Math. Bull., Vol. 1, No. 3). A recursive method is suggested to obtain the number 
of simple sampling plans of size n and the authors hope to establish the general result that 


the number of such sampling plans is ( 


n-—1 


28. On Sampling with Varying Probabilities and With Replacement in Sub- 
sampling Designs. J. N. K. Rao, Iowa State University. (Introduced by 
T. A. Bancroft) 


In sub-sampling, it is usual practice to select the primaries with replacement and with
varying probabilities, due to difficulties in the theory of sampling with varying probabili-
ties and without replacement. This leads to three different methods of selecting the second-
aries. In method 1, if the ith primary is selected $\lambda_i$ times, $m_i\lambda_i$ secondaries are selected
without replacement and with equal probabilities from the ith primary. In method 2, if
the ith primary is selected $\lambda_i$ times, $\lambda_i$ sub-samples of size $m_i$ are drawn independently of
each other from the ith primary with equal probability and without replacement, each sub-
sample being replaced after it is drawn. In method 3, when the ith primary is selected $\lambda_i$
times, a fixed size of $m_i$ is drawn from the ith primary with equal probability and without
replacement and the estimate from the ith primary is weighted by $\lambda_i$. It is known that
method 1 has smaller variance than method 2, and method 2 has smaller variance than
method 3. But the three methods have different expected costs, assuming that expected
cost in a primary is proportional to expected sample size from the primary. Therefore it
would appear more reasonable to compare the efficiency of the three methods for the same 
expected sample size. Here a comparison of the variances has been made for the same ex- 
pected sample size but the conclusions remain the same regarding efficiency. 


29. Some Results on Transformations in the Analysis of Variance. M. M. Rao, 
Carnegie Institute of Technology. 


The square-root and the logarithmic transformations are considered when the mean is 
large in each case. In the former the variance is assumed known, and in the latter the cor- 
responding assumption is that the coefficient of variation is small but the variance is un- 
known. In these cases, it is shown that the usual normal theory is applicable to test the 
hypotheses on means of the untransformed variables. These results extend those of E. G. 
Olds and N. C. Severo (These Annals, 1956). Sufficient conditions for the applicability of 
the normal theory are presented for a class of distributions depending on a finite set of
parameters with one parameter large, while the others, if any, are relatively small, or are
confined to a fixed bounded set in the parameter space. 


30. Normal Approximation to the Chi-square and Non-central F Probability 
Functions. Norman C. Severo and Marvin Zelen, University of Buffalo
and National Bureau of Standards. (By title) 


Let $x_p$ denote the 100p% percentage point of the normal distribution, i.e.,
$\Phi(x_p) = \int_{-\infty}^{x_p} (2\pi)^{-1/2} \exp(-t^2/2)\,dt = 1 - p$. It is shown that the 100p% percentage point of the Chi-
square distribution with $\nu$ degrees of freedom may be approximated by

$$\chi^2_p(\nu) = \nu\{1 - [2/(9\nu)] + (x_p - h_\nu)[2/(9\nu)]^{1/2}\}^3$$

where $h_\nu$ is an auxiliary function whose value may be obtained by linear interpolation in
one of two short tables (one for $\nu \ge 30$, and one for $5 \le \nu < 30$) consisting of only 15 en-
tries each. For values of p between .005 and .995, this improved "Wilson-Hilferty" approxi-
mation gives results in error by at most .01 for $\nu \ge 30$, and at most .05 for $5 \le \nu < 30$.
Let $P(F' \mid \nu_1, \nu_2, \lambda)$ denote the probability distribution function of the non-central F
distribution with degrees of freedom $\nu_1$ and $\nu_2$, and non-centrality parameter $\lambda$. It is shown
that $P(F' \mid \nu_1, \nu_2, \lambda) \approx \Phi(x)$, where

$$x = \frac{\left[\dfrac{\nu_1 F'}{\nu_1 + \lambda}\right]^{1/3}\left[1 - \dfrac{2}{9\nu_2}\right] - \left[1 - \dfrac{2(\nu_1 + 2\lambda)}{9(\nu_1 + \lambda)^2}\right]}{\left\{\dfrac{2(\nu_1 + 2\lambda)}{9(\nu_1 + \lambda)^2} + \dfrac{2}{9\nu_2}\left[\dfrac{\nu_1 F'}{\nu_1 + \lambda}\right]^{2/3}\right\}^{1/2}}.$$

For values of the parameters investigated, the error of the approximation is at most .01.
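
As a rough illustration of the chi-square formula, the sketch below evaluates $\nu\{1 - 2/(9\nu) + (x_p - h_\nu)\sqrt{2/(9\nu)}\}^3$. The correction tables for $h_\nu$ are not given in the abstract, so the argument `h_nu` defaults to 0, which yields the classical Wilson-Hilferty approximation rather than the improved one; the function name and that default are assumptions.

```python
from statistics import NormalDist


def chi_square_upper_point(p, nu, h_nu=0.0):
    """Approximate upper 100p% point of chi-square with nu degrees of freedom.

    h_nu = 0 gives the classical Wilson-Hilferty formula; the improved version
    would take h_nu from the authors' short tables, which are not reproduced here."""
    x_p = NormalDist().inv_cdf(1.0 - p)          # normal point with upper-tail area p
    a = 2.0 / (9.0 * nu)
    return nu * (1.0 - a + (x_p - h_nu) * a ** 0.5) ** 3


# Example: upper 5% point with 10 degrees of freedom (tabulated value about 18.31).
approx_point = chi_square_upper_point(0.05, 10)
```

Even without the correction term the result agrees with the tabulated 5% point for 10 degrees of freedom to within a few hundredths.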


31. On a Geometrical Method of Construction of Cyclic PBIB (Preliminary 
Report). Esther Seiden, Northwestern University. (By title)


An effective method of construction of cyclic PBIB is found provided that the number
of treatments is $2^{2m} - 1$, m a positive integer. Using the notation of R. C. Bose and T.
Shimamoto ("Classification and analysis of partially balanced incomplete block designs
with two associate classes," J.A.S.A., Vol. 47, 1952), the parameters are as follows:
$v = 2^{2m} - 1$, $b = (2^m + 2)(2^m + 1)/2$, $r = (2^m + 2)/2$, $k = 2^m - 1$, $\lambda_1 = 1$, $\lambda_2 = 0$,
$n_1 = 2^{2m}/2 - 2$, $n_2 = 2^{2m}/2$, $\alpha = 2^{2m}/4 - 3$, $\beta = 2^{2m}/4 - 1$;
$v = 2^{2m} - 1$, $b = 2^m(2^m - 1)/2$, $r = 2^m/2$, $k = 2^m + 1$, $\lambda_1 = 1$, $\lambda_2 = 0$,
$n_1 = 2^{2m}/2$, $n_2 = 2^{2m}/2 - 2$, $\alpha = 2^{2m}/4$, $\beta = 2^{2m}/4$.
The construction makes use of the fact that in a projective plane with $2^m + 1$ points on a
line there exists an effective construction of a Desarguesian plane based on a set of $2^m + 2$
points of which no three are on one line. The problem whether such a construction is pos-
sible if based on a non-Desarguesian plane is under investigation.
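
Any two-associate-class PBIB design must satisfy $bk = vr$, $n_1 + n_2 = v - 1$ and $n_1\lambda_1 + n_2\lambda_2 = r(k - 1)$. The sketch below checks these identities for the two parameter sets, read here as $v = 2^{2m} - 1$ with block sizes $k = 2^m - 1$ and $k = 2^m + 1$ respectively; that reading of the printed list is an assumption, and the function names are hypothetical.

```python
def satisfies_pbib_identities(v, b, r, k, lam1, lam2, n1, n2):
    """Necessary counting identities for a PBIB design with two associate classes."""
    return (b * k == v * r
            and n1 + n2 == v - 1
            and n1 * lam1 + n2 * lam2 == r * (k - 1))


def first_series(m):
    """Assumed reading of the first parameter set (block size 2^m - 1)."""
    q, N = 2 ** m, 2 ** (2 * m)
    return dict(v=N - 1, b=(q + 2) * (q + 1) // 2, r=(q + 2) // 2, k=q - 1,
                lam1=1, lam2=0, n1=N // 2 - 2, n2=N // 2)


def second_series(m):
    """Assumed reading of the second parameter set (block size 2^m + 1)."""
    q, N = 2 ** m, 2 ** (2 * m)
    return dict(v=N - 1, b=q * (q - 1) // 2, r=q // 2, k=q + 1,
                lam1=1, lam2=0, n1=N // 2, n2=N // 2 - 2)
```

For example, m = 2 gives v = 15 with (b, r, k) = (15, 3, 3) in the first set and (6, 2, 5) in the second, and both pass the three identities.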


32. Distribution of Quantiles in Samples from a Bivariate Population. M. M.
Siddiqui, Boulder Laboratories.


Let $F(x, y)$ be the distribution function of (X, Y) possessing a pdf $f(x, y)$. Let
$F_1(x)$ (pdf $f_1(x)$) and $F_2(y)$ (pdf $f_2(y)$) be the marginal distributions of X and Y respec-
tively. Given two numbers $F_1$ and $F_2$ in (0, 1) let $\alpha$ and $\beta$ be the numbers such that $F_1(\alpha) = F_1$
and $F_2(\beta) = F_2$. Assume that the first and second partial derivatives of $F(x, y)$ are con-
tinuous at $(\alpha, \beta)$ and $f(\alpha, \beta) \ne 0$.
A random sample $(X_k, Y_k)$, $k = 1, \ldots, n$, is drawn and the values of X and Y are or-
dered separately so that $X_1 < X_2 < \cdots < X_n$; $Y_1 < Y_2 < \cdots < Y_n$. Let i and j be the in-
tegers such that $i/n \le F_1 < (i + 1)/n$, $j/n \le F_2 < (j + 1)/n$. Let M be the number of
sample points (X, Y) such that $X < X_i$ and $Y < Y_j$. The joint distribution of $(M, X_i, Y_j)$
is obtained and it is shown that it is asymptotically trivariate normal. The asymptotic
correlation coefficient between $(X_i, Y_j)$ is given by

$$\rho = (F - F_1F_2) / [F_1F_2(1 - F_1)(1 - F_2)]^{1/2}, \qquad F = F(\alpha, \beta).$$

The statistic M/n has asymptotic mean F and variance of order $n^{-1}$. This is used to set up
confidence limits on $\rho$. A generalization to the asymptotic distribution of a set of quantiles
in samples from a multivariate population is stated.
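
The correlation formula is simple to evaluate once $F(\alpha, \beta)$ is known. The sketch below computes $\rho = (F - F_1F_2)/\sqrt{F_1F_2(1 - F_1)(1 - F_2)}$ and, as an assumed worked example not taken from the abstract, applies it to the two sample medians of a standardized bivariate normal population with correlation r, where the orthant probability is $F(0, 0) = 1/4 + \arcsin(r)/(2\pi)$. The function names are hypothetical.

```python
import math


def quantile_correlation(F, F1, F2):
    """Asymptotic correlation between the sample F1-quantile of X and F2-quantile of Y."""
    return (F - F1 * F2) / math.sqrt(F1 * F2 * (1.0 - F1) * (1.0 - F2))


def median_correlation_bivariate_normal(r):
    """Worked example: medians of a standardized bivariate normal with correlation r."""
    F = 0.25 + math.asin(r) / (2.0 * math.pi)    # P(X < 0, Y < 0)
    return quantile_correlation(F, 0.5, 0.5)
```

In this example the expression simplifies to $(2/\pi)\arcsin r$, so r = 0.5 gives an asymptotic correlation of 1/3 between the two sample medians, and independence (F = F_1 F_2) gives 0.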


33. Power Characteristics of the Control Chart for Means. Frederick A. Soren-
sen, United States Steel Corporation Applied Research Laboratory.


Methods are derived for the determination of the Type I error probability and the power 
of the control chart for sample means (no standard given). Under the null hypothesis, the 
process is assumed to be $N(\mu, \sigma^2)$, where $\mu$ and $\sigma$ are unknown constants. Under the alterna-
tives considered, the process is assumed to be $N(\mu_i, \sigma^2)$ during the time interval from which
the ith subgroup (i = 1, ..., k) is taken. For k = 2, 3, 5, 10 and 25, and subgroup sizes of
5 and 10, the power is tabulated with respect to two particular types of alternative be-
lieved to be typical of those encountered in practice: (1) One of the $\mu_i$ differs from the
rest by an amount $\delta\sigma$ (single slippage); (2) Two of the $\mu_i$ differ from the rest by an equal
amount $\delta\sigma$, but in opposite directions (symmetrical double slippage). The effect of using
variable-width limits that produce a constant Type I error probability of 0.05 rather than 
using the traditional ‘‘three-sigma” limits is investigated. The power of the control chart 
is compared with that of the corresponding Model I analysis of variance test. 


34. A Set of Sufficient Statistics for Variance Components in a Two-Way Classi-
fication Model With Unequal Numbers in the Subclasses. David L.
Weeks and Franklin A. Graybill, Oklahoma State University. (By
title)


One of the important methods of estimating variance components is by the analysis of
variance (A.O.V.). The A.O.V. method of estimating variance
components consists of obtaining an analysis of variance table, equating observed mean
squares to expected mean squares and solving these equations for the estimates of the vari- 
ance components. If the model is Eisenhart’s Model II, then the A.O.V. method of estimat- 
ing variance components gives estimators which are unbiased. If the model also has equal 
numbers in all subclasses, and all random variables are normally and independently dis- 
tributed, the A.O.V. method gives unique, minimum variance, unbiased estimators. If the 
model is Eisenhart’s Model II with equal numbers in all subclasses, and if all random vari- 
ables are independently but not necessarily normally distributed, then the A.O.V. method 
of estimation gives estimators which are minimum variance, quadratic unbiased. However, 
if the model has unequal numbers in the subclasses, the problem is more complex. The 
A.O.V. method of estimation does not give minimum variance unbiased estimators in this 
case. The purpose of this paper is to exhibit a set of sufficient statistics for the general two- 
way classification model with unequal numbers in the subclasses. In particular, we show 
that the row totals, the column totals, and the intra-block error form a set of sufficient 
statistics for the variance components in a two-way classification model with unequal 
numbers in the subclasses. 
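
The A.O.V. recipe described in this abstract (build the table, equate observed to expected mean squares, solve) can be sketched for the simplest balanced case. The sketch below assumes an additive two-way Model II with exactly one observation per cell, so that the expected mean squares are E[MS rows] = s_e + c s_a, E[MS cols] = s_e + r s_b and E[MS error] = s_e. The function name and the dictionary keys are illustrative only, and this is the equal-numbers case the abstract contrasts with, not the unequal-numbers problem the paper itself treats.

```python
def aov_variance_components(y):
    """A.O.V. estimates for y[i][j] = mu + a_i + b_j + e_ij (one observation per cell).

    y is a list of r rows, each holding c observations.
    Returns estimates of the row, column and error variance components.
    """
    r, c = len(y), len(y[0])
    grand = sum(sum(row) for row in y) / (r * c)
    row_means = [sum(row) / c for row in y]
    col_means = [sum(y[i][j] for i in range(r)) / r for j in range(c)]

    # Sums of squares of the analysis of variance table.
    ss_rows = c * sum((m - grand) ** 2 for m in row_means)
    ss_cols = r * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((y[i][j] - grand) ** 2 for i in range(r) for j in range(c))
    ss_err = ss_total - ss_rows - ss_cols

    # Observed mean squares.
    ms_rows = ss_rows / (r - 1)
    ms_cols = ss_cols / (c - 1)
    ms_err = ss_err / ((r - 1) * (c - 1))

    # Equate observed to expected mean squares and solve.
    return {
        "rows": (ms_rows - ms_err) / c,
        "cols": (ms_cols - ms_err) / r,
        "error": ms_err,
    }
```

With unequal subclass numbers the sums of squares no longer decompose this cleanly, which is the situation the abstract describes as more complex.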
35. Minimal Sufficient Statistics for Eisenhart’s Model II in a Class of Two-Way 
Classification Models. David L. Weeks and Franklin A. Graybill, 
Oklahoma State University. (By title) 


The class of designs in which the number of experimental units per block is constant, and 
the number of observations per treatment is constant, is examined in order to determine 
a set of minimal sufficient statistics. This class of designs includes as a subset, the balanced 
incomplete block, and the partially balanced incomplete block designs. Eisenhart’s Model 
II under normal theory is assumed. The number of minimal sufficient statistics is expressed 
as a function of the distinct characteristic roots of the matrix NN’ where N is the incidence 
matrix of the design. The distribution of each statistic is given and pairwise independence 
investigated. In the case of the BIB and GD-PBIB’s, the statistics are defined explicitly 
in terms of quantities normally calculated in the analysis of variance. Instructions for 
computing the statistics easily in the case of the GD-PBIB’s are also given. 
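
For the balanced incomplete block case, the distinct characteristic roots of NN’ mentioned above can be written down directly, since NN’ then has r on the diagonal and λ elsewhere. The sketch below builds NN’ from a list of blocks, checks that structure, and returns the two roots with their multiplicities. The function names are hypothetical, treatments are assumed to be labelled 0, ..., v-1, and the sketch does not attempt the abstract's formula for the number of minimal sufficient statistics.

```python
def concurrence_matrix(blocks, v):
    """Return NN' where N[t][j] = 1 if treatment t occurs in block j, else 0."""
    n = [[1 if t in blk else 0 for blk in blocks] for t in range(v)]
    b = len(blocks)
    return [[sum(n[s][j] * n[t][j] for j in range(b)) for t in range(v)]
            for s in range(v)]


def bib_roots(blocks, v):
    """Distinct characteristic roots of NN' for a BIB design, as {root: multiplicity}."""
    nnt = concurrence_matrix(blocks, v)
    r, lam = nnt[0][0], nnt[0][1]
    # A BIB design has NN' = (r - lam) I + lam J; refuse anything else.
    for s in range(v):
        for t in range(v):
            if nnt[s][t] != (r if s == t else lam):
                raise ValueError("design is not a balanced incomplete block design")
    # The all-ones vector gives the root r + (v - 1) * lam (equal to r * k);
    # every contrast vector gives the root r - lam.
    return {r + (v - 1) * lam: 1, r - lam: v - 1}
```

For example, the design with blocks {0, 1}, {0, 2}, {1, 2} has r = 2 and λ = 1, so the roots are 4 (once) and 1 (twice).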


36. Two New Continuous Sampling Plans. John S. White, General Motors 
Technical Center Research Labs. 


Two new continuous sampling plans are proposed. Both plans are variations of the con- 
cepts used by Dodge and Torrey (Ind. Qual. Cont., Jan., 1951) in CSP-2 and CSP-3. CSP-3 
differs from CSP-2 in that following the discovery of a defective unit during sampling in- 
spection, the next four units submitted must be inspected and found non-defective if sam- 
pling inspection is to continue. A new plan, called CSP-2.1, is proposed which requires only 
that the next unit after the defective pass inspection rather than the next four, in order 
that sampling inspection be continued. In this notation, the original CSP-2 might be de- 
noted as CSP-2.0 and CSP-3 as CSP-2.4. A second plan is given which does require the 
inspection of four units following the discovery of a defective during sampling inspection 
but which eliminates the spacing number (i.e. k = 0). Graphs giving contours of constant 
sampling frequency in the (AOQL, i) plane, where i is the clearance number, are provided 
for both plans. 
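
The family of plans in this abstract can be illustrated with a small state machine. The sketch below rests on several assumptions that are not in the abstract: sampling takes every `every`-th unit rather than a random fraction, the spacing number k is counted in sampled units, and the m units inspected right after a defective do not count toward k. The function and parameter names are hypothetical. Setting m = 0, 1 or 4 corresponds to the abstract's CSP-2.0, CSP-2.1 and CSP-2.4, and m = 4 with k = 0 corresponds to the second proposed plan.

```python
def run_plan(defects, i, every, m, k):
    """Return, for each unit, whether it is inspected.

    defects: sequence of booleans, True for a defective unit
    i:       clearance number (consecutive clear units needed to start sampling)
    every:   inspect one unit out of each `every` while sampling (assumed systematic)
    m:       units inspected immediately after a defective found in sampling
    k:       spacing number; a second defective within k sampled units
             sends the plan back to 100% inspection
    """
    inspected = []
    mode = "screen"      # "screen" = 100% inspection, "sample", or "check"
    clear_run = 0        # consecutive clear units seen while screening
    since_sample = 0     # units passed since the last sampled unit
    check_left = 0       # units still to inspect after a defective
    watch = 0            # sampled units left in the spacing window
    for bad in defects:
        if mode == "screen":
            inspected.append(True)
            clear_run = 0 if bad else clear_run + 1
            if clear_run == i:
                mode, since_sample, watch = "sample", 0, 0
        elif mode == "check":
            inspected.append(True)
            if bad:
                mode, clear_run = "screen", 0
            else:
                check_left -= 1
                if check_left == 0:
                    mode, since_sample, watch = "sample", 0, k
        else:
            since_sample += 1
            if since_sample < every:
                inspected.append(False)
                continue
            inspected.append(True)
            since_sample = 0
            if bad:
                if watch > 0:
                    mode, clear_run = "screen", 0
                elif m > 0:
                    mode, check_left = "check", m
                else:
                    watch = k
            elif watch > 0:
                watch -= 1
    return inspected
```

Running the same defect sequence with m = 0, 1 and 4 shows how the plans differ only in the number of units inspected right after a defective is found.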


37. On a Class of Covariance Kernels Admitting a Power Series Expansion 
(Preliminary Report). N. Donald Ylvisaker, Columbia University. 


Let $K(T)$ denote the class of covariance kernels defined on $T \times T$ and let 
$a = (a_0, a_1, \cdots)$ be a sequence of nonnegative real numbers. The mapping 
$\varphi_a : K \to \sum_{j=0}^{\infty} a_j K^j$ maps 
$B_a = \{K \in K(T) \mid \sum_{j=0}^{\infty} a_j K^j(s, s) < \infty \text{ for all } s \in T\}$ into $K(T)$. 
This paper discusses these mappings and in particular the reproducing kernel space 
associated with the kernel $\varphi_a(K)$ is studied relative to the reproducing kernel space 
associated with the kernel $K$. Some applications of these results are noted in reference to 
problems of mean value estimation under the model $Y(t) = m(t, \beta) + X(t)$, $t \in T$, 
$\beta \in A \subset R_1$, where $X(\cdot)$ is assumed to be a Gaussian process with mean 
function zero and known covariance kernel. 


38. A Calculus for Factorial Arrangements. M. Zelen and B. Kurkjian, 
National Bureau of Standards and Diamond Ordnance Fuze Laboratories. 


A calculus complete with special axioms, operations, and rules of formation is formally 
defined with respect to factorial arrangements. The object of this calculus is to permit easy 
manipulation of complicated mathematical operations. Its use enables large order matrix 
operations to be carried out using logical operations. 







CORRECTION TO ABSTRACTS 


“Semi-Markov Processes: Countable State Space” and “Stationary Probabilities 
for a Semi-Markov Process with Finitely Many States” 


By Ronald Pyke 
Columbia University 


The titles of the above-named abstracts, numbers 55 and 74 on pages 240 and 
245-46 respectively in the March 1960 Ann. Math. Stat., were reversed in printing. 
Therefore, “Semi-Markov Processes: Countable State Space” applies to 
abstract 55 and “Stationary Probabilities for a Semi-Markov Process with 
Finitely Many States” applies to abstract 74. 





NEWS AND NOTICES 
Readers are invited to submit to the Secretary of the Institute news items of interest. 


Personal Items 


Dr. Leo Aroian, recently Head of the Mathematics Section and Senior Mathe- 
matical Consultant at Hughes Aircraft Company, has accepted a position as 
member of the Technical Staff at Space Technology Laboratories. 

Oskar N. Anderson of the University of Munich, Germany, died on February 
12, 1960. 

George E. P. Box has left his post as Director of the Statistical Techniques 
Research Group at Princeton University to take up an appointment at the Uni- 
versity of Wisconsin as Professor of Statistics. 

Mr. Roshan L. Chaddha has completed his Ph.D. degree at Virginia Poly- 
technic Institute. He is joining the Department of Statistics at Kansas State 
University and will be located at Manhattan, Kansas during the next year. The 
dissertation topic for the Ph.D. degree was “An Inventory Control Problem with 
Regular and Emergency Demands”. 

Georges Darmois, professor at the Institut de Statistique of the University of 
Paris, died on January 3, 1960. 

Norman R. Draper has left Imperial Chemical Industries (Plastics Division) 
England to spend the academic year 1960-61 at the Army Mathematics Research 
Center, Madison, Wisconsin. 

Dr. Seymour Geisser has returned to the National Institute of Mental Health, 
Washington, D. C. after having served as visiting Associate Professor at the 
Iowa State University during the Spring Quarter, 1960. 

H. S. Graf has joined The Teleregister Corporation, Stamford, Connecticut. 

Professor E. J. Gumbel (Columbia University) will participate at the June 
meeting of the International Statistical Institute in Tokyo and will give papers 
on the theory of extreme values at the Universities of Kyoto, Osaka, Tokyo (In- 
stitute of Technology), and Manila. During July he will give a course on mathe- 
matical statistics at the Chulalongkorn University, Bangkok, Thailand. 

Dr. John Gurland, Professor in the Statistical Laboratory and Department of 
Statistics, Iowa State University, has been awarded a travel grant by the 
Committee on International Conference Travel Grants of the American Statistical 
Association, to attend the Biometric Society Symposium on Quantitative Meth- 
ods in Pharmacology at the University of Leyden (Netherlands) May 10-13. 
Dr. Gurland will act as the official representative of the American Statistical 
Association at this Symposium, and will also present a paper entitled 
“Determination of Minute Insecticidal Residues Through Biological Assay.” 

Stuart T. Hadden has joined the Process Engineering Department of The 
Dow Chemical Company as a Systems Engineer. 

F. M. Hemphill, Ph.D., formerly Professor of Public Health Statistics, School 
of Public Health, The University of Michigan, was recently commissioned in 
the Regular Corps United States Public Health Service. He is now on duty as 
Scientist Director serving as Chief of the Statistical Design and Analysis Section 
of the Statistics and Analysis Branch of the Division of Research Grants, Na- 
tional Institutes of Health, Bethesda, Maryland. 

Mark L. Hinkle, Jr., is currently a member of the Reliability and Maintaina- 
bility Unit in General Electric Company’s Light Military Electronics Depart- 
ment, 901 Broad Street, Utica, New York. 

Palmer O. Johnson, professor of education and chairman of the department of 
statistics at the University of Minnesota, died at the age of 68 on January 24, 
1960. 

Shriniwas Keshavarao Katti completed requirements for the Ph.D. Degree in 
statistics at the Iowa State University in January 1960. He has joined the staff 
of the new Department of Statistics at the Florida State University, Tallahassee, 
Florida, as Assistant Professor of Statistics. 

Dr. J. H. B. Kemperman, who was engaged last year at the University of Am- 
sterdam (Netherlands), has been promoted to Professor in the Department of 
Mathematics and Statistics at Purdue University, Indiana. 

A. I. Khinchin died at the age of 65 on November 18, 1959. Khinchin had 
studied and taught at the University of Moscow and V. A. Steklov Mathematical 
Institute. 

Gilbert Lieberman, formerly a mathematician with the Naval Ordnance Lab- 
oratory, Silver Spring, Maryland, is now a Senior Engineer with The Radio 
Corporation of America, Camden, New Jersey. 

John W. Mayne, since March 1959, has been Chief of the Operational Research 
Section at Supreme Headquarters Allied Powers Europe (SHAPE), Paris, 
France. This section is part of the SHAPE Air Defence Technical Centre’s 
System Evaluation Group, and is concerned mainly with air defense problems of 
Allied Command Europe. 

Robert H. Morris has been named associate director of the newly formed 
Business Operations Analysis Staff at Eastman Kodak Company, Rochester 4, 
New York, whose purpose is applying scientific and mathematical techniques 
as aids in analyzing a wide range of business problems. 

Dr. Stanley W. Nash of the University of British Columbia has been ap- 
pointed as Visiting Associate Professor at the Statistical Laboratory, Iowa 
State University for a period of one year beginning July 1, 1960. 

José R. Padré, Assistant Professor at the University of Puerto Rico, received 
a Ph.D. in Mathematics from St. Louis University in June, 1960. His dissertation 
was written under the direction of Dr. Waldo A. Vezeau. Dr. Padré has been on 
leave-of-absence and sponsored by the University of Puerto Rico while doing 
his graduate studies. He will resume teaching at the Department of Mathematics, 
University of Puerto Rico, Rio Piedras, Puerto Rico. 

B. E. Phillips has been made Assistant Technical Director, Reliability, Ground 
Support Equipment in The Martin Company, Baltimore, Maryland. 
Donald M. Roberts has received his Ph.D. Degree at Stanford University and 
has accepted an assistant professorship in the Mathematics Department at the 
University of Illinois. 

Ernest M. Scheuer, Space Technology Laboratories, Inc., has received the 
Ph.D. in mathematics from U. C. L. A. 


Earl A. Thomas, formerly Technical Advisor, Ballistic Missiles Division, 
Burroughs Corporation, has joined the staff of The Institute for Defense Analyses. 

Vernon E. Weckwerth has returned as lecturer and Administrative Director 
of the third Graduate Session of Statistics in the Health Sciences at the 
University of Minnesota. Mr. Weckwerth was head of Research and Statistics for 
the American Hospital Association in Chicago and Assistant Director of the 
Hospital Research and Educational Trust. He also taught at Northwestern 
University last fall. 




NEW MEMBERS 


The following persons have been elected to membership in the Institute 


Andrews, Horace P., Ph.D. (Pennsylvania State University); Head Statistics Division- 
Research Laboratories, Swift and Company, U.S. Yards, Chicago 9, Illinois. 

Balintfy, Joseph L., Diplomas in English and Economics, (University of Technical Sciences, 
Budapest); Assistant Professor, Tulane University, School of Business Administration, 
New Orleans 18, Louisiana. 

Baxter, Colin Benjamin, B.S. (University of Sheffield, England); Lecturer in Mathematics, 
Harrow Technical College, Watford Road, Northwick Park, Harrow-on-the-Hill, Mid- 
dlesex, England; 25, Swallowbeck Avenue, Lincoln, England. 

Beale, Evelyn M. L., B.A. (Cambridge University); Member of Mathematics Group, 
Admiralty Research Laboratory, Teddington, Middlesex, England. 

Berens, Alan Paul, M.S. (Purdue University); Graduate Assistant, Purdue University, 
West Lafayette, Indiana. 

Blischke, Wallace R., M.S. (Cornell University); National Science Foundation Cooperative 
Graduate Fellow, Cornell University, Department of Plant Breeding, Ithaca, New York. 

Chakravarti, Indra Mohan, D. Phil. (Sc). (University of Calcutta); Visiting Assistant Pro- 
fessor, University of North Carolina, Department of Statistics, Chapel Hill, North Carolina. 

Chow, Yuan S., Ph.D. (University of Illinois); Research Staff Mathematician, I.B.M. 
Research Center, Yorktown Heights, New York. 

Collins, Gwyn, B.Sc. (Nottingham); Research Associate, Advertising Research Foundation, 
3 E. 54th Street, New York City 22, New York; 63 E. 9th Street, New York City 3, New 
York. 

Dagen, Herbert B., B.A. (City College of New York); Chief, Statistical Operations Division, 
U. S. Army Chemical Corps, Quality Assurance Technical Agency, Army Chemical 
Center, Maryland; 8710 Bowers Avenue, Baltimore 7, Maryland. 

Davidson, Harold, Ch. E. (Columbia University); Staff Engineer, I. B. M. Corporation- 
IPC-ASDD, Yorktown Heights, New York; 455 E. 14th Street, New York 9, New York. 

Farquhar, Thomas H., S.B. (Massachusetts Institute of Technology); Student, 
Massachusetts Institute of Technology, Cambridge 39, Massachusetts; 22 Magazine Street, 
Cambridge 39, Massachusetts. 
Foster, William F., B.S. (U. S. Naval Academy); Lieutenant United States Navy, Student, 
Postgraduate School, Monterey, California, 880 Arlington Place, Monterey, California. 

Gnedenko, Boris, Member Academie of Science of Ukrainien SSR; Professor of Mathemat- 
ics, Chief Statistical Department, Mathematical Institute of Academie of Science, Kali- 
nine Pl. 6, Kiev, USSR; Sverdeor Str. 10, App 2g, Kiew 8, USSR. 

Gulotta, Charles William, A.B. (Hunter College); Programmer, Great American Insurance 
Company, 99 John Street, New York, N. Y., 95-85 114th Street, Richmond Hill 18, New 
York. 

Hora, Rajinder Bir, M.S. (Panjab University, India); Associate Engineer, Boeing Airplane 
Company and Graduate Student at University of Washington, Boeing Airplane Com- 
pany, Renton, Washington; 4738 - 16th North East, Seattle 5, Washington. 

Kappel, Joseph George, M.S. (University of Illinois); Assistant, Department of 
Mathematics, University of Illinois, Urbana, Illinois; 306 S. Urbana Avenue, Urbana, Illinois. 

Katti, Shriniwas K., Ph.D. (Iowa State University); Assistant Professor, Florida State Uni- 
versity, Tallahassee, Florida. 

Kenworthy, Orville O., M.S. (Oklahoma State); Administrative Assistant, Ferro Corpora- 
tion, 4150 E. 56th Street, Cleveland 5, Ohio. 

King, R. Maurice, Jr., B.S. (University of North Carolina); Experimental Statistician, 
American Cyanamid Company, Stamford Labs, 1937 W. Main Street, Stamford, Con- 
necticut; 52 Sinawoy Road, Cos Cob, Connecticut. 

Kirchgässner, Klaus, Dr. rer. nat., (Nat.-Math. Fakultät der Universität Freiburg i.Br., 
Deutschland); Scientific Assistant, Institut für Angewandte Mathematik der Universität 
Freiburg i.Br., Freiburg i.Br., Deutschland, Friedrichstr. 87. 

Ku, Hsien H., M.S. (Purdue University); Mathematician, National Bureau of Standards, 
Department of Commerce, Washington, 25, D. C.; 5439-30th Place, N.W., Washington, 
D.C. 

Laue, Richard V., M.S. (Rutgers University); Statistician, Bell Telephone Laboratories, 
Murray Hill, New Jersey. 

Lee, Ray H., M.S. (Stanford University); Chief Mathematician, Autometric Corporation, 
1501 Broadway, New York City, New York. 

Levin, Morris J., Ph.D. (Columbia University); Engineering Scientist, Missile Electronics 
and Controls Division, Radio Corporation of America, Burlington, Massachusetts; 
370 Concord Avenue, Cambridge, Massachusetts. 

McGuire, Charles Bartlett, M.A. (University of Chicago); Economist, Rand Corporation, 
1700 Main Street, Santa Monica, California. 

Mandis, George A., B.S. (Roosevelt University of Chicago); Member Operations Research 
Analysis, Grumman Aircraft Engineering Corporation, Bethpage, L. I., New York; 
George A. Mandis and Associates, 6 Wooleys Lane, Great Neck, New York. 

Marques Henriques, José Manuel, (Lisbon School of Economics); Student, Lisbon School 
of Economics (Instituto Superior de Ciencias Economicas e Financeiras), R. do Quelhas, 
6, Lisbon, Portugal; R. do Sol, ao Rato, 57, 2° Esq’, Lisbon 2, Portugal. 

Mehr, Cyrus B., M.S. Industrial Engineering (Purdue University); Instructor and 
part-time Student, Purdue University, West Lafayette, Indiana; 207-11 Airport Road, 
West Lafayette, Indiana. 

Mohanty, Sri Gopal, M.A. (Punjab University, India); Graduate Student, Department of 
Mathematics, University of Alberta, Edmonton, Canada. 

Montzingo, Lloyd J., Jr., M.A. (University of Buffalo); Instructor, University of Buffalo, 
Buffalo 14, New York. 

Murray, Charles W. Jr., B.A. (Duke University); Engineer, Melpar, Inc., 3000 Arlington 
Blvd., Falls Church, Virginia; 109 Chapel Drive, Annandale, Virginia. 

Neuts, Marcel Fernand, M.S. (Stanford University); Graduate Student, Department of 
Statistics, Stanford University, Stanford, California; 2365 Waverly Street, Palo Alto, 
California. 

Nielsen, Aage Volund, M.S. Chem. Eng. (Technical University of Denmark); Civilingenior, 
Statistical Institute of University of Copenhagen, Denmark; Norre Alle 75, Room 544, 
Copenhagen O, Denmark. 

Novick, Melvin R., M.A. (Roosevelt University); Student Research Assistant, Department 
of Statistics, University of North Carolina, Chapel Hill, North Carolina; 102C Isley 
Street, Chapel Hill, North Carolina. 

Okana, K. Frederick, M.A. (University of Minnesota); Mathematical Statistician, National 
Aeronautics and Space Administration (AAR) 1520 H. Street, N.W., Washington 25, D.C. 

Pandharipande, Vikas Raghunath, Int. in Science (Nagpur University, India); c/o Mrs. K. 
Pandharipande, Advocate, Ramdas Peth, Nagpur, India. 

Posten, Harry O., M.S. (Kansas State University); Ph.D. Candidate, Virginia Polytechnic 
Institute, Blacksburg, Virginia. 

Rao, U. V. Ramamohana (U. V. R.) M.A. (Andhra University, India); Graduate Assistant, 
Department of Mathematics, Indiana University, Bloomington, Indiana. 

Ray, Sudhindra Narayan, M.S. (Calcutta University, India); Student, University of North 
Carolina, Chapel Hill, N. C., Department of Statistics; 308 Connor Dorm., University 
of North Carolina, Chapel Hill, N.C. 

Richardson, Earle Wesley, Jr., A.B. (Georgetown University); Student, American Univer- 
sity, Washington 16, D. C.; 8109 44th Street, N.W., Washington 16, D. C. 

Schwarz, Gideon E., M.Sc. (Hebrew University); Graduate Student, Department of Mathe- 
matical Statistics, Columbia University, New York 27, N.Y. 

Shy, William H., M.A. (University of Georgia); Research Analyst, Kimberly-Clark 
Corporation, Neenah, Wisconsin; P. O. Box 22, Neenah, Wisconsin. 

Singh, Shorh Nath, M.A. (Banaras Hindu University, India); Graduate Student, Depart- 
ment of Statistics, University of California, Berkeley, California. 

Soriano, Abraham, B.S. (Rensselaer Polytechnic Institute); Industrial Engineering 
Department, Johns Hopkins University (Hospital), Baltimore, Md.; 3301 Saint Paul Street, 
Baltimore 18, Maryland. 

Sternberg, I. Paul, M.S. (Rutgers University); Director, Quality Control, Whittaker Gyro, 
Division of Telecomputing Corporation, 16217 Lindbergh Street, Van Nuys, California. 

Sukhatme, (Mrs.) Shashikala, M.S. (University of Poona, India); Graduate Assistant, 
Department of Statistics, Michigan State University, East Lansing, Michigan. 

Susco, Dante V., M.A. (University of California, Los Angeles); Staff Member, Los Alamos 
Scientific Laboratory, P.O. Box 1663, Los Alamos, New Mexico. 

Umegaki, Hisaharu, Ph.D. (Mathematical Institute, Tohoku University, Japan); Assistant 
Professor, Department of Mathematics, Tokyo Institute of Technology, Oh-okayama, 
Meguro-ku, Tokyo, Japan; Akasaka Aoyama Minamicho 6-108, Minato-ku, Tokyo, Japan. 

Van Dyke, John, M.A. (Michigan State University); Assistant Instructor, Department of 
Statistics, Michigan State University, East Lansing, Michigan. 

Wolfe, John H., B.S. (California Institute of Technology); Student, Department of 
Psychology, University of California, Berkeley, California; 2299 Piedmont, Berkeley 4, 
California. 

Wu, Shih Yen, M.A. (Northwestern University); Assistant Professor of Economics, Los 
Angeles State College, Los Angeles 32, California. 




PRELIMINARY ACTUARIAL EXAMINATIONS PRIZE AWARDS 
ANNOUNCED 


The winners of the prize awards offered by the Society of Actuaries to the 
nine undergraduates ranking highest on the score of the General Mathematics 
Examination of the 1960 Preliminary Actuarial Examinations are as follows: 

First Prize of $200 

Gitlin, Todd A.           Harvard University 

Additional Prizes of $100 each 

Emerson, William R.       California Institute of Technology 
Goodman, Richard H.       Harvard University 
Landman, Maurice A.       Harvard University 
Lorden, Gary A.           California Institute of Technology 
McDonnell, Robert N.      University of Chicago 
Newmeyer, John A.         California Institute of Technology 
Sampson, Schuyler S.      Bowdoin College 
Shulsky, Abram N.         Cornell University 


The Society of Actuaries has authorized a similar set of nine prizes for 1961. 
Beginning in 1961, the Preliminary Actuarial Examinations will consist of two 
examinations: The General Mathematics Examination (based on the first two 
years of college mathematics), and The Probability and Statistics Examination. 
The 1961 Preliminary Actuarial Examinations will be prepared by the Educa- 
tional Testing Service under the direction of a committee of actuaries and mathe- 
maticians, and will be administered by the Society of Actuaries at centers 
throughout the United States and Canada on November 16, 1960 and on May 10, 
1961. The closing date for applications is April 1, 1961. Further information 
concerning these Examinations can be obtained from the Society of Actuaries, 
208 South LaSalle Street, Chicago 4, Illinois. 




RESEARCH FELLOWSHIPS IN PSYCHOMETRICS OFFERED 


The Educational Testing Service is offering for 1961-62 its fourteenth series of 
research fellowships in psychometrics leading to the Ph.D. degree at Princeton 
University. Open to men who are acceptable to the Graduate School of the 
University, the two fellowships each carry a stipend of $3,750 a year, plus an 
allowance for dependent children. These fellowships are normally renewable. 
Fellows will be engaged in part-time research in the general area of psychological 
measurement at the offices of the Educational Testing Service and will, in 
addition, carry a normal program of studies in the Graduate School. 

Suitable undergraduate preparation may consist either of a major in psychol- 
ogy with supporting work in mathematics, or a major in mathematics together 
with some work in psychology. However, in choosing fellows, primary emphasis 
is given to superior scholastic attainment and research interests rather than to 
specific course preparation. 

The closing date for completing applications is January 6, 1961. Information 
and application blanks will be available about September 15 and may be ob- 
tained from: Director of Psychometric Fellowship Program, Educational 
Testing Service, 20 Nassau Street, Princeton, New Jersey. 




INSTITUTE OF MANAGEMENT SCIENCES HOLDS MEETING 


The Seventh International Meeting of the Institute of Management Sciences 
will be held at Hotel Roosevelt, New York City, October 20-22, 1960. Among 
the topics for sessions of the meeting are the following: information processing 
and management science, computors and simulation techniques, and mathemati- 
cal methods for management science. Further information may be obtained from 
Mr. James Townsend, Union Carbide Corp., 270 Park Avenue, New York 17, 
N. Y. 




SYMPOSIUM ON MATHEMATICAL OPTIMIZATION TECHNIQUES 


A Symposium on Mathematical Optimization Techniques will be held on 
October 18, 19, and 20, 1960 at the University of California, Berkeley. The 
Symposium is sponsored jointly by the University of California and RAND 
Corporation, with twenty invited speakers presenting papers during the three- 
day sessions on such topics as linear, non-linear and discrete programming, 
variational processes: adaptive and stochastic, optimal decision processes, and 
optimum networks and structures. For further information, write to Robert M. 
Oliver, Department of Industrial Engineering, University of California, Berkeley 
4, California. 




UNIVERSITY OF MINNESOTA STATISTICS DEPARTMENT 
ESTABLISHED 


During the academic year 1958-1959 the Department of Statistics was estab- 
lished at the University of Minnesota. The Department has supplemented and 
coordinated the statistical activities of the University—graduate curriculum, 
research, and consulting. The Department’s organization involves direct ap- 
pointments as well as joint appointments in mathematics and the sciences. 
Following is the current staff of the Faculties of Statistics: Statistics: L. Hurwicz, 
I. Olkin, D. Richter, I. R. Savage, M. Sobel; Mathematics: G. Baxter, 
M. Donsker, B. Lindgren, S. Orey, W. Pruitt, E. Reich, F. Spitzer; Agriculture: 
R. Comstock, C. Gates; Biostatistics: J. Bearman, J. Berkson, B. Brown, E. 
Johnson, R. McHugh; Business Administration: D. Hastings, J. Neter; Industrial 
Engineering: G. McElrath. 




DOCTORAL DISSERTATIONS IN STATISTICS, 1959 


The following listing was inadvertently omitted from the June 1960 issue of the 
Annals: 

William Leonard Harkness, Michigan State University, major in statistics, 
“An Investigation of the Power Function for the Test of Independence in 
2 × 2 Contingency Tables.” 
REPORT OF THE NEW YORK CITY MEETING OF THE INSTITUTE 
OF MATHEMATICAL STATISTICS 


The eighty-fourth meeting of the Institute of Mathematical Statistics was 
held at Teachers College, Columbia University, on April 21-23, 1960, in 
conjunction with a meeting of the Biometric Society (Eastern North American Region). 
Program chairman for the meeting was Rosedith Sitgreaves, Teachers College; 
Ronald Pyke, Columbia University, was Assistant Secretary. In addition to 
technical sessions there were an informal party at the Columbia University 
Men’s Faculty Club, and two coffee hours. 

211 persons, including 142 members of the Institute, registered for the meeting. 
The program of the meeting was as follows. All sessions for invited papers were 
jointly sponsored by the Institute and the Biometric Society. 


THURSDAY, APRIL 21, 1960 
10:00-12:00 a.m.—Inventory and Queueing Theory 


Chairman: Howard Levene, Columbia University. 
1. “A Multi-Server Queueing Problem,” Peter E. Ney, Cornell University. 
2. “Queues in Series,” Jerome Sacks, Columbia University. 
3. “On the Transient Behavior of a Queueing Process with Batch Service,” Lajos Takács, 
Columbia University. 


1:30-2:30 p.m.—Invited Address 


Chairman: Boyd Harshbarger, Virginia Polytechnic Institute. 
1. “Two Methods of Constructing Exact Tests,” James Durbin, University of North 
Carolina. 


2:30-4:30 p.m.—Selected Topics I 


Chairman: C. Y. Kramer, Virginia Polytechnic Institute. 

1. “On Comparing Different Tests of the Same Hypothesis,” H. A. David and Carmen A. 
Perez, Virginia Polytechnic Institute. 

2. “On the Replacement of Periodically Inspected Equipment,” C. Derman, Columbia 
University. 

3. “Lower Bounds on the Probability Associated with Certain Confidence Regions for the 
Multivariate Median,” Ernest M. Scheuer, U.C.L.A. and Space Technology 
Laboratories, Inc. 


2:30-4:30 p.m.—Contributed Papers I 


Chairman: Milton Sobel, Bell Telephone Laboratories. 

1. “A Noiseless Comma-Free Coding Theorem,” Thomas S. Ferguson, U.C.L.A. and 
Princeton University. 

2. “On Centering Infinitely Divisible Processes,” Ronald Pyke, Columbia University. 

3. “Inference About Non-Stationary Markov Chains,” Ruth Z. Gold, Columbia 
University. 

4. “On Linear Estimation of a Single Parameter of a Mean Function under Second Order 
Disturbance” (Preliminary Report), N. Donald Ylvisaker, Columbia University. 

5. “Asymptotic Shapes of Optimal Stopping Regions for Sequential Testing” (Preliminary 
Report), Gideon Schwarz, Columbia University (Introduced by T. W. Anderson). 

6. “The Asymptotic Power of the Kolmogorov Tests of Goodness of Fit,” Dana Quade, 
University of North Carolina. 


FRIDAY, APRIL 22, 1960 
9:00-10:30 a.m.—Problems in Multivariate Analysis 


Chairman: I. Blumen, Cornell University. 

1. “Multivariate Analysis in Psychology and Education,” Rolf Bargmann, Virginia 
Polytechnic Institute. 

2. “Multivariate Experimental Designs,” Harry Rosenblatt, Federal Aviation Agency. 


9:00-10:30 a.m.—Contributed Papers II 


Chairman: Marvin Zelen, National Bureau of Standards. 
1. “Efficient Sequential Estimators with High Precision Only in a Small Interval,” Allan 
Birnbaum, New York University. 
2. “Asymptotic Variance as an Approximation to Expected Loss for Maximum Likelihood 